Preface


This manual contains detailed solutions to all problems in Digital Image Processing, 2nd 
Edition. We also include a suggested set of guidelines for using the book, and discuss 
the use of computer projects designed to promote a deeper understanding of the subject 
matter. The notation used throughout this manual corresponds to the notation used in 
the text. 

The decision of what material to cover in a course rests with the instructor, and it de- 
pends on the purpose of the course and the background of the students. We have found 
that the course outlines suggested here can be covered comfortably in the time frames 
indicated when the course is being taught in an electrical engineering or computer sci- 
ence curriculum. In each case, no prior exposure to image processing is assumed. We 
give suggested guidelines for one-semester courses at the senior and first-year graduate 
levels. It is possible to cover most of the book in a two-semester graduate sequence. 

The book was completely revised in this edition, with the purpose not only of updating 
the material, but just as important, making the book a better teaching aid. To this 
end, the instructor will find the new organization to be much more flexible and better 
illustrated. Although the book is self-contained, we recommend use of the companion
web site, where the student will find detailed solutions to the problems marked with a 
star in the text, review material, suggested projects, and images from the book. One of 
the principal reasons for creating the web site was to free the instructor from having to 
prepare materials and handouts beyond what is required to teach from the book. 

Computer projects such as those described in the web site are an important part of 
a course on image processing. These projects give the student hands-on experience 
with algorithm implementation and reinforce the material covered in the classroom. 
The projects suggested at the web site can be implemented on almost any reasonably
equipped multi-user or personal computer having a hard copy output device.
1 Introduction


The purpose of this chapter is to present suggested guidelines for teaching material from 
this book at the senior and first-year graduate level. We also discuss use of the book 
web site. Although the book is totally self-contained, the web site offers, among other 
things, complementary review material and computer projects that can be assigned in 
conjunction with classroom work. Detailed solutions to all problems in the book also 
are included in the remaining chapters of this manual. 


Teaching Features of the Book 

Undergraduate programs that offer digital image processing typically limit coverage to 
one semester. Graduate programs vary, and can include one or two semesters of the ma- 
terial. In the following discussion we give general guidelines for a one-semester senior 
course, a one-semester graduate course, and a full-year course of study covering two 
semesters. We assume a 15-week program per semester with three lectures per week. 
In order to provide flexibility for exams and review sessions, the guidelines discussed 
in the following sections are based on forty 50-minute lectures per semester. The background
assumed on the part of the student is senior-level preparation in mathematical
analysis, matrix theory, probability, and computer programming. 

The suggested teaching guidelines are presented in terms of general objectives, and not 
as time schedules. There is so much variety in the way image processing material is 
taught that it makes little sense to attempt a breakdown of the material by class period. 
In particular, the organization of the present edition of the book is such that it makes it 
much easier than before to adopt significantly different teaching strategies, depending 
on course objectives and student background. For example, it is possible with the new 
organization to offer a course that emphasizes spatial techniques and covers little or no 
transform material. This is not something we recommend, but it is an option that often 
is attractive in programs that place little emphasis on the signal processing aspects of the 
field and prefer to focus more on the implementation of spatial techniques. 

The companion web site 

www.prenhall.com/gonzalezwoods 

or 

www.imageprocessingbook.com 

is a valuable teaching aid, in the sense that it includes material that previously was covered
in class. In particular, the review material on probability, matrices, vectors, and
linear systems was prepared using the same notation as in the book, and is focused on
areas that are directly relevant to discussions in the text. This allows the instructor to 
assign the material as independent reading, and spend no more than one total lecture pe- 
riod reviewing those subjects. Another major feature is the set of solutions to problems 
marked with a star in the book. These solutions are quite detailed, and were prepared 
with the idea of using them as teaching support. The on-line availability of projects 
and digital images frees the instructor from having to prepare experiments, data, and 
handouts for students. The fact that most of the images in the book are available for 
downloading further enhances the value of the web site as a teaching resource. 


One Semester Senior Course 

A basic strategy in teaching a senior course is to focus on aspects of image processing in 
which both the inputs and outputs of those processes are images. In the scope of a senior 
course, this usually means the material contained in Chapters 1 through 6. Depending
on instructor preferences, wavelets (Chapter 7) usually are beyond the scope of coverage
in a typical senior curriculum. However, we recommend covering at least some material
on image compression (Chapter 8) as outlined below. 

We have found in more than two decades of teaching this material to seniors in electrical 
engineering, computer science, and other technical disciplines, that one of the keys to 
success is to spend at least one lecture on motivation and the equivalent of one lecture 
on review of background material, as the need arises. The motivational material is 
provided in the numerous application areas discussed in Chapter 1. This chapter was 
totally rewritten with this objective in mind. Some of this material can be covered in 
class and the rest assigned as independent reading. Background review should cover 
probability theory (of one random variable) before histogram processing (Section 3.3). 
A brief review of vectors and matrices may be required later, depending on the material 
covered. The review material included in the book web site was designed for just this 
purpose. 

Chapter 2 should be covered in its entirety. Some of the material (such as parts of 
Sections 2.1 and 2.3) can be assigned as independent reading, but a detailed explanation 
of Sections 2.4 through 2.6 is time well spent. 

Chapter 3 serves two principal purposes. It covers image enhancement (a topic of signif- 
icant appeal to the beginning student) and it introduces a host of basic spatial processing 
tools used throughout the book. For a senior course, we recommend coverage of Sec- 
tions 3.2.1 through 3.2.2; Section 3.3.1; Section 3.4; Section 3.5; Section 3.6; Section 
3.7.1, 3.7.2 (through Example 3.11), and 3.7.3. Section 3.8 can be assigned as independent
reading, depending on time.

Chapter 4 also discusses enhancement, but from a frequency-domain point of view. The 
instructor has significant flexibility here. As mentioned earlier, it is possible to skip 
the chapter altogether, but this will typically preclude meaningful coverage of other 
areas based on the Fourier transform (such as filtering and restoration). The key in 
covering the frequency domain is to get to the convolution theorem and thus develop 
a tie between the frequency and spatial domains. All this material is presented in very 
readable form in Section 4.2. “Light” coverage of frequency-domain concepts can be 
based on discussing all the material through this section and then selecting a few simple 
filtering examples (say, low- and highpass filtering using Butterworth filters, as discussed 
in Sections 4.3.2 and 4.4.2). At the discretion of the instructor, additional material can 
include full coverage of Sections 4.3 and 4.4. It is seldom possible to go beyond this 
point in a senior course. 

Chapter 5 can be covered as a continuation of Chapter 4. Section 5.1 makes this an easy 
approach. Then, it is possible to give the student a “flavor” of what restoration is (and still
keep the discussion brief) by covering only Gaussian and impulse noise in Section 5.2.1, 
and a couple of spatial filters in Section 5.3. This latter section is a frequent source of 
confusion to the student who, based on discussions earlier in the chapter, is expecting to 
see a more objective approach. It is worthwhile to emphasize at this point that spatial 
enhancement and restoration are the same thing when it comes to noise reduction by 
spatial filtering. A good way to keep it brief and conclude coverage of restoration 
is to jump at this point to inverse filtering (which follows directly from the model in 
Section 5.1) and show the problems with this approach. Then, with a brief explanation 
regarding the fact that much of restoration centers around the instabilities inherent in 
inverse filtering, it is possible to introduce the “interactive” form of the Wiener filter in 
Eq. (5.8-3) and conclude the chapter with Examples 5.12 and 5.13. 

Chapter 6 on color image processing is a new feature of the book. Coverage of this 
chapter also can be brief at the senior level by focusing on enough material to give the 
student a foundation on the physics of color (Section 6.1), two basic color models (RGB
and CMY/CMYK), and then concluding with a brief coverage of pseudocolor processing 
(Section 6.3). 

We typically conclude a senior course by covering some of the basic aspects of image 
compression (Chapter 8). Interest in this topic has increased significantly as a result of
the heavy use of images and graphics over the Internet, and students usually are easily 
motivated by the topic. Minimum coverage of this material includes Sections 8.1.1 and
8.1.2, Section 8.2, and Section 8.4.1. In this limited scope, it is worthwhile spending
one-half of a lecture period filling in any gaps that may arise by skipping earlier parts of
the chapter. 


One Semester Graduate Course (No Background in DIP) 

The main difference between a senior and a first-year graduate course in which neither 
group has formal background in image processing is mostly in the scope of material 
covered, in the sense that we simply go faster in a graduate course, and feel much freer 
in assigning independent reading. In addition to the material discussed in the previous 
section, we add the following material in a graduate course. 

Coverage of histogram matching (Section 3.3.2) is added. Sections 4.3, 4.4, and 4.5 
are covered in full. Section 4.6 is touched upon briefly regarding the fact that imple- 
mentation of discrete Fourier transform techniques requires non-intuitive concepts such 
as function padding. The separability of the Fourier transform should be covered, and 
mention of the advantages of the FFT should be made. In Chapter 5 we add Sections 5.5 
through 5.8. In Chapter 6 we add the HSI model (Section 6.3.2), Section 6.4, and Section
6.6. A nice introduction to wavelets (Chapter 7) can be achieved by a combination
of classroom discussions and independent reading. The minimum set of sections in
that chapter is 7.1, 7.2, 7.3, and 7.5, with appropriate (but brief) mention of the existence
of fast wavelet transforms. Finally, in Chapter 8 we add coverage of Sections 8.3,
8.4.2, 8.5.1 (through Example 8.16), 8.5.2 (through Example 8.20), and
8.5.3.

If additional time is available, a natural topic to cover next is morphological image 
processing (Chapter 9). The material in this chapter begins a transition from methods 
whose inputs and outputs are images to methods in which the inputs are images, but 
the outputs are attributes about those images, in the sense defined in Section 1.1. We 
recommend coverage of Sections 9.1 through 9.4, and some of the algorithms in Section 
9.5. 

One Semester Graduate Course (with Background in DIP) 

Some programs have an undergraduate course in image processing as a prerequisite to 
a graduate course on the subject. In this case, it is possible to cover material from the 
first eleven chapters of the book. Using the undergraduate guidelines described above, 
we add the following material to form a teaching outline for a one semester graduate 
course that has that undergraduate material as prerequisite. Given that students have the 
appropriate background on the subject, independent reading assignments can be used to 
control the schedule. 

Coverage of histogram matching (Section 3.3.2) is added. Sections 4.3, 4.4, 4.5, and 4.6
are added. This strengthens the student’s background in frequency-domain concepts.
A more extensive coverage of Chapter 5 is possible by adding Sections 5.2.3, 5.3.3,
5.4.3, 5.5, 5.6, and 5.8. In Chapter 6 we add full-color image processing (Sections 6.4 
through 6.7). Chapters 7 and 8 are covered as in the previous section. As noted in the
previous section, Chapter 9 begins a transition from methods whose inputs and outputs
are images to methods in which the inputs are images, but the outputs are attributes 
about those images. As a minimum, we recommend coverage of binary morphology: 
Sections 9.1 through 9.4, and some of the algorithms in Section 9.5. Mention should 
be made about possible extensions to gray-scale images, but coverage of this material 
may not be possible, depending on the schedule. In Chapter 10, we recommend Sections
10.1, 10.2.1 and 10.2.2, 10.3.1 through 10.3.4, 10.4, and 10.5. In Chapter 11 we typically
cover Sections 11.1 through 11.4.

Two Semester Graduate Course (No Background in DIP) 

A full-year graduate course consists of the material covered in the one semester undergraduate
course, the material outlined in the previous section, and Sections 12.1, 12.2,
12.3.1, and 12.3.2.


Projects 


One of the most interesting aspects of a course in digital image processing is the pictorial 
nature of the subject. It has been our experience that students truly enjoy and benefit 
from judicious use of computer projects to complement the material covered in class. 
Since computer projects are in addition to course work and homework assignments, we 
try to keep the formal project reporting as brief as possible. In order to facilitate grading, 
we try to achieve uniformity in the way project reports are prepared. A useful report 
format is as follows: 

Page 1: Cover page. 

• Project title 

• Project number 

• Course number 

• Student’s name 

• Date due 

• Date handed in 

• Abstract (not to exceed 1/2 page) 

Page 2: One to two pages (max) of technical discussion. 

Page 3 (or 4): Discussion of results. One to two pages (max). 

Results: Image results (printed typically on a laser or inkjet printer). All images must 
contain a number and title referred to in the discussion of results. 

Appendix: Program listings, focused on any original code prepared by the student. For 
brevity, functions and routines provided to the student are referred to by name, but the 
code is not included. 

Layout: The entire report must be on a standard sheet size (e.g., 8.5 x 11 inches), 
stapled with three or more staples on the left margin to form a booklet, or bound using 
clear plastic standard binding products. 

Project resources available in the book web site include a sample project, a list of sug- 
gested projects from which the instructor can select, book and other images, and MAT- 
LAB functions. Instructors who do not wish to use MATLAB will find additional soft- 
ware suggestions in the Support/Software section of the web site. 
2 Problem Solutions


Problem 2.1 


The diameter, x, of the retinal image corresponding to the dot is obtained from similar
triangles, as shown in Fig. P2.1. That is,

$$\frac{d/2}{0.2} = \frac{x/2}{0.014}$$

which gives $x = 0.07d$. From the discussion in Section 2.1.1, and taking some liberties
of interpretation, we can think of the fovea as a square sensor array having on the order of
337,000 elements, which translates into an array of size 580 x 580 elements. Assuming
equal spacing between elements, this gives 580 elements and 579 spaces on a line 1.5
mm long. The size of each element and each space is then $s = (1.5\ \text{mm})/1{,}159 =
1.3 \times 10^{-6}$ m. If the size (on the fovea) of the imaged dot is less than the size of a single
resolution element, we assume that the dot will be invisible to the eye. In other words,
the eye will not detect a dot if its diameter, d, is such that $0.07d < 1.3 \times 10^{-6}$ m, or
$d < 18.6 \times 10^{-6}$ m.
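
The arithmetic above can be checked with a short sketch. The variable names are illustrative and not part of the original solution, and the two distances are used only as a ratio, exactly as in the similar-triangles relation.

```python
# Similar triangles: (d/2)/0.2 = (x/2)/0.014, so x = (0.014/0.2) * d.
OBJECT_DISTANCE = 0.2     # distance used in the solution
EYE_FOCAL_LENGTH = 0.014  # same units as OBJECT_DISTANCE

scale = EYE_FOCAL_LENGTH / OBJECT_DISTANCE  # x = scale * d

# Fovea modeled as 580 elements and 579 spaces on a line 1.5 mm long.
elements = 580
spaces = elements - 1
fovea_line_m = 1.5e-3
s = fovea_line_m / (elements + spaces)  # size of one element (or one space), in m

# The dot is assumed invisible when scale * d < s.
d_min = s / scale

print(f"x = {scale:.2f} d")
print(f"s = {s:.3e} m")
print(f"d_min = {d_min:.3e} m")
```

Without intermediate rounding the bound comes out near 18.5 x 10^-6 m; the 18.6 figure in the text follows from rounding s to 1.3 x 10^-6 m before dividing by 0.07.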



Figure P2.1 

Problem 2.2 

Brightness adaptation. 

Problem 2.3 

$\lambda = c/\nu = 2.998 \times 10^{8}\ \text{(m/s)} / 60\ \text{(1/s)} = 4.99 \times 10^{6}$ m $\approx$ 5000 km.

Problem 2.4 

(a) From the discussion on the electromagnetic spectrum in Section 2.2, the source of 
the illumination required to see an object must have wavelength the same size or smaller 
than the object. Because interest lies only on the boundary shape and not on other spec- 
tral characteristics of the specimens, a single illumination source in the far ultraviolet 
(wavelength of 0.001 microns or less) will be able to detect all objects. A far-ultraviolet
camera sensor would be needed to image the specimens. (b) No answer required since
the answer to (a) is affirmative.

Problem 2.5 

From the geometry of Fig. 2.3, 7 mm/35 mm = z/500 mm, or z = 100 mm. So the target
size is 100 mm on the side. We have a total of 1024 elements per line, so the resolution
of 1 line is 1024/100, or approximately 10 elements/mm. For line pairs we divide by 2, giving an
answer of 5 lp/mm.

Problem 2.6 


One possible solution is to equip a monochrome camera with a mechanical device that 
sequentially places a red, a green, and a blue pass filter in front of the lens. The strongest
camera response determines the color. If all three responses are approximately equal, 
the object is white. A faster system would utilize three different cameras, each equipped 
with an individual filter. The analysis would be then based on polling the response of 
each camera. This system would be a little more expensive, but it would be faster and 
more reliable. Note that both solutions assume that the field of view of the camera(s) is 
such that it is completely filled by a uniform color [i.e., the camera(s) is(are) focused on 
a part of the vehicle where only its color is seen. Otherwise further analysis would be 
required to isolate the region of uniform color, which is all that is of interest in solving 
this problem].


Problem 2.7 


The image in question is given by 


$$\begin{aligned}
f(x, y) &= i(x, y)\, r(x, y) \\
        &= 255 e^{-[(x - x_0)^2 + (y - y_0)^2]} (1.0) \\
        &= 255 e^{-[(x - x_0)^2 + (y - y_0)^2]}
\end{aligned}$$

A cross section of the image is shown in Fig. P2.7(a). If the intensity is quantized using
m bits, then we have the situation shown in Fig. P2.7(b), where $\Delta G = (255 + 1)/2^m$.
Since an abrupt change of 8 gray levels is assumed to be detectable by the eye, it follows
that $\Delta G = 8 = 256/2^m$, or m = 5. In other words, 32, or fewer, gray levels will
produce visible false contouring. 
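
As a quick check of this result, the sketch below tabulates the quantization step for each bit depth and picks the largest m whose step is still at least 8 gray levels. The function names are illustrative only.

```python
def gray_step(m, levels=256):
    """Quantization step Delta G = 256 / 2**m for m bits of intensity resolution."""
    return levels / 2 ** m


def largest_m_with_visible_contours(threshold=8, max_bits=8):
    """Largest m for which the step is still >= the detectable change."""
    candidates = [m for m in range(1, max_bits + 1) if gray_step(m) >= threshold]
    return max(candidates)


m = largest_m_with_visible_contours()
print(m, 2 ** m)  # bit depth and the corresponding number of gray levels
```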

Figure P2.7 (panels (a) and (b); vertical axis: Intensity)
Problem 2.8


The use of two bits (m = 2) of intensity resolution produces four gray levels in the range 
0 to 255. One way to subdivide this range is to let all levels between 0 and 63 be coded 
as 63, all levels between 64 and 127 be coded as 127, and so on. The image resulting 
from this type of subdivision is shown in Fig. P2.8. Of course, there are other ways to 
subdivide the range [0, 255] into four bands. 
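
The subdivision described above (each band coded as its top value) can be sketched as follows. This is only one of the possible mappings, and the function name is hypothetical.

```python
def quantize_to_band_top(level, bits=2, full_range=256):
    """Map a gray level in [0, 255] to the top value of its band.

    With bits=2 the range is split into four bands of 64 levels:
    0..63 -> 63, 64..127 -> 127, 128..191 -> 191, 192..255 -> 255.
    """
    band = full_range // (2 ** bits)
    return (level // band) * band + band - 1


coded = sorted({quantize_to_band_top(v) for v in range(256)})
print(coded)
```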



Figure P2.8 


Problem 2.9 


(a) The total amount of data (including the start and stop bit) in an 8-bit, 1024 x 1024
image, is $(1024)^2 \times [8 + 2]$ bits. The total time required to transmit this image over a
56K baud link is $(1024)^2 \times [8 + 2]/56000 = 187.25$ sec or about 3.1 min. (b) At
750K this time goes down to about 14 sec.
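
A minimal sketch of the same computation, assuming (as in the solution) that each 8-bit pixel is framed by one start bit and one stop bit. The function name and parameters are illustrative.

```python
def transmission_time(width, height, bits_per_pixel, baud, framing_bits=2):
    """Seconds needed to send an image when every pixel carries framing bits."""
    total_bits = width * height * (bits_per_pixel + framing_bits)
    return total_bits / baud


t_56k = transmission_time(1024, 1024, 8, 56_000)
t_750k = transmission_time(1024, 1024, 8, 750_000)
print(f"{t_56k:.2f} s ({t_56k / 60:.1f} min), {t_750k:.1f} s")
```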


Problem 2.10 


The width-to-height ratio is 16/9 and the resolution in the vertical direction is 1125 lines 
(or, what is the same thing, 1125 pixels in the vertical direction). It is given that the 
resolution in the horizontal direction is in the 16/9 proportion, so the resolution in the
horizontal direction is (1125) x (16/9) = 2000 pixels per line. The system “paints” a full
1125 x 2000, 8-bit image every 1/30 sec for each of the red, green, and blue component
images. There are 7200 sec in two hours, so the total digital data generated in this time
interval is $(1125)(2000)(8)(30)(3)(7200) = 1.166 \times 10^{13}$ bits, or $1.458 \times 10^{12}$ bytes
(i.e., about 1.5 terabytes). These figures show why image data compression (Chapter
8) is so important.
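
The data volume can be reproduced with integer arithmetic. The names below are illustrative, and the constants are the ones quoted in the solution.

```python
LINES = 1125                   # vertical resolution
PIXELS_PER_LINE = LINES * 16 // 9  # 16:9 aspect ratio -> 2000 pixels per line
BITS_PER_PIXEL = 8
FRAMES_PER_SEC = 30
COLOR_PLANES = 3               # red, green, and blue component images
SECONDS = 2 * 60 * 60          # two hours

total_bits = (LINES * PIXELS_PER_LINE * BITS_PER_PIXEL
              * FRAMES_PER_SEC * COLOR_PLANES * SECONDS)
total_bytes = total_bits // 8
print(f"{total_bits:.3e} bits, {total_bytes:.3e} bytes")
```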


Problem 2.11 

Let p and q be as shown in Fig. P2.11. Then, (a) $S_1$ and $S_2$ are not 4-connected because
q is not in the set $N_4(p)$; (b) $S_1$ and $S_2$ are 8-connected because q is in the set $N_8(p)$;
(c) $S_1$ and $S_2$ are m-connected because (i) q is in $N_D(p)$, and (ii) the set $N_4(p) \cap N_4(q)$
is empty.

Figure P2.11

Problem 2.12 

The solution to this problem consists of defining all possible neighborhood shapes to 
go from a diagonal segment to a corresponding 4-connected segment, as shown in Fig.
P2.12. The algorithm then simply looks for the appropriate match every time a diagonal
segment is encountered in the boundary. 

Problem 2.13 


The solution to this problem is the same as for Problem 2.12 because converting from 
an m-connected path to a 4-connected path simply involves detecting diagonal segments 
and converting them to the appropriate 4-connected segment. 

Problem 2.14 

A region R of an image is composed of a set of connected points in the image. The 
boundary of a region is the set of points that have one or more neighbors that are not in 
R. Because boundary points also are part of R, it follows that a point on the boundary 
has at least one neighbor in R and at least one neighbor not in R. (If the point in the 
boundary did not have a neighbor in R, the point would be disconnected from R, which
violates the definition of points in a region.) Since all points in R are part of a connected 
component (see Section 2.5.2), all points in the boundary are also connected and a path 
(entirely in R ) exists between any two points on the boundary. Thus the boundary forms 
a closed path. 

Problem 2.15 


(a) When V = {0, 1}, a 4-path does not exist between p and q because it is impossible to
get from p to q by traveling along points that are both 4-adjacent and also have values
from V. Figure P2.15(a) shows this condition; it is not possible to get to q. The shortest
8-path is shown in Fig. P2.15(b); its length is 4. The length of the shortest m-path
(shown dashed) is 5. Both of these shortest paths are unique in this case. (b) One
possibility for the shortest 4-path when V = {1, 2} is shown in Fig. P2.15(c); its length
is 6. It is easily verified that another 4-path of the same length exists between p and q.
One possibility for the shortest 8-path (it is not unique) is shown in Fig. P2.15(d); its
length is 4. The length of a shortest m-path (shown dashed) is 6. This path is not unique.
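
The path lengths quoted in this solution can be checked with a breadth-first search. The sketch below is not part of the original solution: the image segment is an assumption reconstructed from the figure, with p at the bottom-left pixel and q at the top-right pixel, and all function names are illustrative. A diagonal step counts as m-adjacent only when neither of the two shared 4-neighbors has a value in V.

```python
from collections import deque

# Image segment assumed from Fig. P2.15 (p = bottom-left, q = top-right).
IMAGE = [
    [3, 1, 2, 1],
    [2, 2, 0, 2],
    [1, 2, 1, 1],
    [1, 0, 1, 2],
]
P, Q = (3, 0), (0, 3)

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
ND = [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def in_v(img, rc, v):
    """True if rc lies inside the image and its value belongs to V."""
    r, c = rc
    return 0 <= r < len(img) and 0 <= c < len(img[0]) and img[r][c] in v


def neighbors(img, rc, v, kind):
    """Pixels adjacent to rc under 4-, 8-, or m-adjacency (kind = '4', '8', 'm')."""
    r, c = rc
    out = [(r + dr, c + dc) for dr, dc in N4 if in_v(img, (r + dr, c + dc), v)]
    if kind == "4":
        return out
    for dr, dc in ND:
        n = (r + dr, c + dc)
        if not in_v(img, n, v):
            continue
        # m-adjacency: reject the diagonal step if a shared 4-neighbor is in V.
        if kind == "m" and (in_v(img, (r + dr, c), v) or in_v(img, (r, c + dc), v)):
            continue
        out.append(n)
    return out


def shortest_path_length(img, start, goal, v, kind):
    """Length of a shortest path from start to goal, or None if none exists."""
    if not (in_v(img, start, v) and in_v(img, goal, v)):
        return None
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == goal:
            return dist[cur]
        for n in neighbors(img, cur, v, kind):
            if n not in dist:
                dist[n] = dist[cur] + 1
                queue.append(n)
    return None


for v in ({0, 1}, {1, 2}):
    print(sorted(v), [shortest_path_length(IMAGE, P, Q, v, k) for k in ("4", "8", "m")])
```

Under the assumed image segment, the search reports no 4-path and lengths 4 and 5 for V = {0, 1}, and lengths 6, 4, and 6 for V = {1, 2}, in agreement with the text.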


Figure P2.15: panels (a) through (d), each showing paths from p to q on the image segment

3 1 2 1 (q)
2 2 0 2
1 2 1 1
(p) 1 0 1 2


Problem 2.16 


(a) A shortest 4-path between a point p with coordinates (x, y) and a point q with coordinates
(s, t) is shown in Fig. P2.16, where the assumption is that all points along the
path are from V. The lengths of the segments of the path are $|x - s|$ and $|y - t|$, respectively.
The total path length is $|x - s| + |y - t|$, which we recognize as the definition
of the $D_4$ distance, as given in Eq. (2.5-16). (Recall that this distance is independent of
any paths that may exist between the points.) The $D_4$ distance obviously is equal to the
length of the shortest 4-path when the length of the path is $|x - s| + |y - t|$. This occurs
whenever we can get from p to q by following a path whose elements (1) are from
V, and (2) are arranged in such a way that we can traverse the path from p to q by making
turns in at most two directions (e.g., right and up). (b) The path may or may not be
unique, depending on V and the values of the points along the way.
Figure P2.16


Problem 2.17 

(a) The $D_8$ distance between p and q (see Fig. P2.16) is defined as $\max(|x - s|, |y - t|)$.
Recall that the $D_8$ distance (unlike the Euclidean distance) counts diagonal segments the
same as horizontal and vertical segments, and, as in the case of the $D_4$ distance, is independent
of whether or not a path exists between p and q. As in the previous problem, the length of the
shortest 8-path is equal to the $D_8$ distance when the path length is $\max(|x - s|, |y - t|)$.
This occurs when we can get from p to q by following a path whose elements (1) are
from V, and (2) are arranged in such a way that we can traverse the path from p to q
by traveling diagonally in only one direction and, whenever diagonal travel is not possible,
by making turns in the horizontal or vertical (but not both) direction. (b) The path
may or may not be unique, depending on V and the values of the points along the way.
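
The two distance measures used in Problems 2.16 and 2.17 depend only on the coordinates of p and q, not on any path between them. A minimal sketch with hypothetical function names:

```python
def d4(p, q):
    """City-block distance |x - s| + |y - t| between p = (x, y) and q = (s, t)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def d8(p, q):
    """Chessboard distance max(|x - s|, |y - t|) between p = (x, y) and q = (s, t)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


p, q = (3, 0), (0, 3)
print(d4(p, q), d8(p, q))
```

A shortest 4-path or 8-path can be longer than these values when the pixels needed for a direct route do not have values in V.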

Problem 2.18 


With reference to Eq. (2.6-1), let H denote the neighborhood sum operator, let $S_1$ and
$S_2$ denote two different small subimage areas of the same size, and let $S_1 + S_2$ denote the
corresponding pixel-by-pixel sum of the elements in $S_1$ and $S_2$, as explained in Section
2.5.4. Note that the size of the neighborhood (i.e., number of pixels) is not changed by
this pixel-by-pixel sum. The operator H computes the sum of pixel values in a given
neighborhood. Then, $H(aS_1 + bS_2)$ means: (1) multiplying the pixels in each of the
subimage areas by the constants shown, (2) adding the pixel-by-pixel values from $S_1$ and
$S_2$ (which produces a single subimage area), and (3) computing the sum of the values
of all the pixels in that single subimage area. Let $ap_1$ and $bp_2$ denote two arbitrary (but
corresponding) pixels from $aS_1 + bS_2$. Then we can write

$$\begin{aligned}
H(aS_1 + bS_2) &= \sum_{p_1 \in S_1 \text{ and } p_2 \in S_2} (ap_1 + bp_2) \\
               &= \sum_{p_1 \in S_1} ap_1 + \sum_{p_2 \in S_2} bp_2 \\
               &= a \sum_{p_1 \in S_1} p_1 + b \sum_{p_2 \in S_2} p_2 \\
               &= aH(S_1) + bH(S_2)
\end{aligned}$$


which, according to Eq. (2.6-1), indicates that H is a linear operator. 


Problem 2.19 


The median, $\xi$, of a set of numbers is such that half the values in the set are below $\xi$ and
the other half are above it. A simple example will suffice to show that Eq. (2.6-1) is violated
by the median operator. Let $S_1 = \{1, -2, 3\}$, $S_2 = \{4, 5, 6\}$, and a = b = 1.
In this case H is the median operator. We then have $H(S_1 + S_2) = \text{median}\{5, 3, 9\} =
5$, where it is understood that $S_1 + S_2$ is the element-by-corresponding-element sum
of $S_1$ and $S_2$. Next, we compute $H(S_1) = \text{median}\{1, -2, 3\} = 1$ and $H(S_2) =
\text{median}\{4, 5, 6\} = 5$. Then, since $H(aS_1 + bS_2) \neq aH(S_1) + bH(S_2)$, it follows
that Eq. (2.6-1) is violated and the median is a nonlinear operator.
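
The counterexample above, together with the linearity of the neighborhood sum from Problem 2.18, can be verified numerically. This is a minimal sketch; `combine` is a hypothetical helper that forms the element-by-element combination aS1 + bS2.

```python
from statistics import median


def combine(a, s1, b, s2):
    """Element-by-corresponding-element combination a*S1 + b*S2."""
    return [a * x + b * y for x, y in zip(s1, s2)]


S1 = [1, -2, 3]
S2 = [4, 5, 6]
a = b = 1

# Neighborhood sum: both sides of Eq. (2.6-1) agree.
sum_lhs = sum(combine(a, S1, b, S2))
sum_rhs = a * sum(S1) + b * sum(S2)

# Median: the two sides differ, so the operator is nonlinear.
med_lhs = median(combine(a, S1, b, S2))
med_rhs = a * median(S1) + b * median(S2)

print(sum_lhs, sum_rhs)  # equal
print(med_lhs, med_rhs)  # different
```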


Problem 2.20 


The geometry of the chips is shown in Fig. P2.20(a). From Fig. P2.20(b) and the
geometry in Fig. 2.3, we know that 


where $\Delta x$ is the side dimension of the image (assumed square since the viewing screen
is square) impinging on the image plane, and the 80 mm refers to the size of the viewing
screen, as described in the problem statement. The most inexpensive solution will result
from using a camera of resolution 512 x 512. Based on the information in Fig. P2.20(a),
a CCD chip with this resolution will be of size (16 µm) x (512) = 8 mm on each side.
Substituting $\Delta x$ = 8 mm in the above equation gives $z = 9\lambda$ as the relationship between
the distance z and the focal length of the lens, where a minus sign was ignored because
it is just a coordinate inversion. If a 25 mm lens is used, the front of the lens will have 
to be located at approximately 225 mm from the viewing screen so that the size of the 
image of the screen projected onto the CCD image plane does not exceed the 8 mm size
of the CCD chip for the 512 x 512 camera. This value for z is reasonable, but it is 
obvious that any of the other given lens sizes would work also; the camera would just 
have to be positioned further away. 




Assuming a 25 mm lens, the next issue is to determine if the smallest defect will be 
imaged on, at least, a 2 x 2 pixel area, as required by the specification. It is given that 
the defects are circular, with the smallest defect having a diameter of 0.8 mm. So, all that 
needs to be done is to determine if the image of a circle of diameter 0.8 mm or greater 
will, at least, be of size 2 x 2 pixels on the CCD imaging plane. This can be determined
by using the same model as in Fig. P2.20(b) with the 80 mm replaced by 0.8 mm. Using
$\lambda$ = 25 mm and z = 225 mm in the above equation yields $\Delta x$ = 100 µm. In other words,
a circular defect of diameter 0.8 mm will be imaged as a circle with a diameter of 100 µm
on the CCD chip of a 512 x 512 camera equipped with a 25 mm lens and which views 
the defect at a distance of 225 mm. 

If, in order for a CCD receptor to be activated, its area has to be excited in its entirety, 
then, it can be seen from Fig. P2.20(a) that to guarantee that a 2 x 2 array of such
receptors will be activated, a circular area of diameter no less than (6)(8) = 48 µm has to
be imaged onto the CCD chip. The smallest defect is imaged as a circle with diameter
of 100 µm, which is well above the 48 µm minimum requirement.

Thus, it is concluded that a CCD camera of resolution 512 x 512 pixels, using a 25 mm 
lens and imaging the viewing screen at a distance of 225 mm, is sufficient to solve the 
problem posed by the plant manager. 
Problem 3.1

(a) General form: $s = T(r) = Ae^{-Kr^2}$. For the condition shown in the problem figure,
$Ae^{-KL_0^2} = A/2$. Solving for K yields

$$-KL_0^2 = \ln(0.5)$$

$$K = 0.693/L_0^2.$$

Then,

$$s = T(r) = Ae^{-0.693\, r^2 / L_0^2}.$$

(b) General form: $s = T(r) = B(1 - e^{-Kr^2})$. For the condition shown in the problem
figure, $B(1 - e^{-KL_0^2}) = B/2$. The solution for K is the same as in (a), so

$$s = T(r) = B(1 - e^{-0.693\, r^2 / L_0^2}).$$

(c) General form: $s = T(r) = (D - C)(1 - e^{-Kr^2}) + C$.
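
The three transformations can be sketched in code, assuming the Gaussian-shaped forms with an exponent proportional to r squared. The function and parameter names are illustrative, and in part (c) the constant K is left as a free parameter because the solution gives only the general form.

```python
import math

LN2 = math.log(2.0)  # 0.693..., from exp(-K * L0**2) = 1/2


def t_a(r, A, L0):
    """Part (a): decays from A at r = 0 to A/2 at r = L0."""
    return A * math.exp(-LN2 * r ** 2 / L0 ** 2)


def t_b(r, B, L0):
    """Part (b): rises from 0 at r = 0 to B/2 at r = L0."""
    return B * (1.0 - math.exp(-LN2 * r ** 2 / L0 ** 2))


def t_c(r, C, D, K):
    """Part (c): rises from C at r = 0 toward D as r grows."""
    return (D - C) * (1.0 - math.exp(-K * r ** 2)) + C


print(t_a(64, 255, 64), t_b(64, 255, 64), t_c(0, 10, 200, 0.001))
```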

Problem 3.2 


(a) $s = T(r) = \dfrac{1}{1 + (m/r)^E}$.

(b) See Fig. P3.2. 


(c) We want the value of $s$ to be 0 for $r < m$, and $s$ to be 1 for values of $r > m$. When
$r = m$, $s = 1/2$. But, because the values of $r$ are integers, the behavior we want is

$$s = T(r) = \begin{cases} 0.0 & \text{when } r \le m - 1 \\ 0.5 & \text{when } r = m \\ 1.0 & \text{when } r \ge m + 1. \end{cases}$$

The question in the problem statement is to find the smallest value of $E$ that will make
the threshold behave as in the equation above. When $r = m$, we see from (a) that
$s = 0.5$, regardless of the value of $E$. If $C$ is the smallest positive number representable
in the computer, and keeping in mind that $s$ is positive, then any value of $s$ less than
$C/2$ will be called 0 by the computer. To find the smallest value of $E$ for which this
happens, simply solve the following equation for $E$, using the given value $m = 128$:

$$\frac{1}{1 + [m/(m - 1)]^E} < C/2.$$

Because the function is symmetric about $m$, the resulting value of $E$ will yield $s = 1$
for $r \ge m + 1$.



Figure P3.2 
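
As a rough numerical check of part (c), the sketch below solves the inequality for $E$ in closed form. The function names and the value used for $C$ are illustrative assumptions, not values from the problem statement.

```python
import math

def contrast_stretch(r, m, e):
    """Transformation from part (a): s = 1 / (1 + (m / r)**e), for r > 0."""
    return 1.0 / (1.0 + (m / r) ** e)

def exponent_bound(m, c):
    """Value of E at which s(m - 1) equals c / 2.

    From 1 / (1 + [m / (m - 1)]**E) < c / 2 we get
    [m / (m - 1)]**E > 2 / c - 1, so any E above the returned bound works.
    """
    return math.log(2.0 / c - 1.0) / math.log(m / (m - 1.0))

m = 128
c = 1e-6   # hypothetical "smallest representable number", for illustration only
e_bound = exponent_bound(m, c)
```

Any exponent slightly above `e_bound` drives `contrast_stretch(m - 1, m, E)` below `c / 2`, while `contrast_stretch(m, m, E)` stays at 0.5 for every `E`.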


Problem 3.3 


The transformations required to produce the individual bit planes are nothing more than 
mappings of the truth table for eight binary variables. In this truth table, the values of 
the 7th bit are 0 for byte values 0 to 127, and 1 for byte values 128 to 255, thus giving 
the transformation mentioned in the problem statement. Note that the given transformed 
values of either 0 or 255 simply indicate a binary image for the 7th bit plane. Any other 
two values would have been equally valid, though less conventional.

Continuing with the truth table concept, the transformation required to produce an image
of the 6th bit plane outputs a 0 for byte values in the range [0, 63], a 1 for byte values in
the range [64, 127], a 0 for byte values in the range [128, 191], and a 1 for byte values
in the range [192, 255]. Similarly, the transformation for the 5th bit plane alternates
between eight ranges of byte values, the transformation for the 4th bit plane alternates
between 16 ranges, and so on. Finally, the output of the transformation for the 0th bit
plane alternates between 0 and 255 depending on whether the byte values are even or odd.
Thus, this transformation alternates between 256 byte value ranges, which explains why an
image of the 0th bit plane is usually the busiest looking of all the bit plane images.


Problem 3.4


(a) The number of pixels having different gray level values would decrease, thus causing
the number of components in the histogram to decrease. Since the number of pixels
would not change, this would cause the height of some of the remaining histogram peaks
to increase in general. Typically, less variability in gray level values will reduce contrast.

(b) The most visible effect would be significant darkening of the image. For example,
dropping the highest bit would limit to 127 the brightest level in an 8-bit image. Since
the number of pixels would remain constant, the height of some of the histogram peaks
would increase. The general shape of the histogram would now be taller and narrower,
with no histogram components being located past 127.


Problem 3.5


All that histogram equalization does is remap histogram components on the intensity
scale. To obtain a uniform (flat) histogram would require in general that pixel intensities
be actually redistributed so that there are L groups of n/L pixels with the same intensity,
where L is the number of allowed discrete intensity levels and n is the total number of
pixels in the input image. The histogram equalization method has no provisions for this
type of (artificial) redistribution process.


Problem 3.6


Let $n$ be the total number of pixels and let $n_{r_j}$ be the number of pixels in the input image
with intensity value $r_j$. Then, the histogram equalization transformation is

$$s_k = T(r_k) = \sum_{j=0}^{k} n_{r_j}/n = \frac{1}{n}\sum_{j=0}^{k} n_{r_j}.$$

Since every pixel (and no others) with value $r_k$ is mapped to value $s_k$, it follows that
$n_{s_k} = n_{r_k}$. A second pass of histogram equalization would produce values $v_k$ according
to the transformation

$$v_k = T(s_k) = \frac{1}{n}\sum_{j=0}^{k} n_{s_j}.$$

But, $n_{s_j} = n_{r_j}$, so

$$v_k = T(s_k) = \frac{1}{n}\sum_{j=0}^{k} n_{r_j} = s_k$$

which shows that a second pass of histogram equalization would yield the same result
as the first pass. We have assumed negligible round-off errors.
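
The argument can be tried on a small example. The following is a minimal sketch, assuming a flat list of integer pixel values and a rescaling of $s_k$ to integer levels; the function name `equalize` and the sample data are made up for illustration.

```python
def equalize(pixels, levels):
    """One pass of discrete histogram equalization on a flat list of pixels."""
    n = len(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    mapping, running = [], 0
    for count in hist:
        running += count                      # sum of n_{r_j} for j <= k
        # s_k = running / n, rescaled to an integer level (an assumption here)
        mapping.append(round((levels - 1) * running / n))
    return [mapping[p] for p in pixels]

pixels = [0, 0, 1, 1, 1, 2, 3, 3, 5, 7]       # made-up 3-bit data
first_pass = equalize(pixels, levels=8)
second_pass = equalize(first_pass, levels=8)
```

Comparing `second_pass` with `first_pass` shows whether the second pass changed anything for this input.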


Problem 3.7 


The general histogram equalization transformation function is 

$$s = T(r) = \int_0^r p_r(w)\,dw.$$

There are two important points of which the student must show awareness in answering
this problem. First, this equation assumes only positive values for $r$. However, the
Gaussian density extends in general from $-\infty$ to $\infty$. Recognition of this fact is
important. Once recognized, the student can approach this difficulty in several ways. One
good answer is to make some assumption, such as the standard deviation being small
enough so that the area of the curve under $p_r(r)$ for negative values of $r$ is negligible.
Another is to scale up the values until the area under the negative tail is negligible. The
second major point is to recognize that the transformation function itself,

$$s = T(r) = \frac{1}{\sqrt{2\pi}\,\sigma}\int_0^r e^{-\frac{(w-m)^2}{2\sigma^2}}\,dw$$

has no closed-form solution. This is the cumulative distribution function of the Gaussian
density, which is either integrated numerically, or its values are looked up in a table. A
third, less important point that the student should address is the high-end values of $r$.
Again, the Gaussian PDF extends to $+\infty$. One possibility here is to make the same
assumption as above regarding the standard deviation. Another is to divide by a large
enough value so that the area under the positive tail past that point is negligible (this
scaling reduces the standard deviation).

Another principal approach the student can take is to work with histograms, in which 
case the transformation function would be in the form of a summation. The issue 
of negative and high positive values must still be addressed, and the possible answers 
suggested above regarding these issues still apply. The student needs to indicate that 
the histogram is obtained by sampling the continuous function, so some mention should 
be made regarding the number of samples (bits) used. The most likely answer is 8 bits, 
in which case the student needs to address the scaling of the function so that the range 
is [0, 255]. 
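
One way to tabulate such a transformation is sketched below. It evaluates the Gaussian integral from 0 to $r$ with the error function from Python's standard library, samples it at 256 levels, and rescales the result to [0, 255]. The mean, standard deviation, and the normalization by the value at the top level are assumptions chosen for illustration.

```python
import math

def gaussian_equalization(r, mean, sigma):
    """T(r): integral of the Gaussian density from 0 to r, computed with erf."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))
    return cdf(r) - cdf(0.0)

def lookup_table(mean, sigma, levels=256):
    """Sample T(r) at r = 0, ..., levels - 1 and rescale to [0, levels - 1]."""
    top = gaussian_equalization(levels - 1, mean, sigma)
    return [round((levels - 1) * gaussian_equalization(r, mean, sigma) / top)
            for r in range(levels)]

# Hypothetical parameters: the tails below 0 and above 255 are negligible here.
table = lookup_table(mean=128.0, sigma=20.0)
```

With a standard deviation this small relative to the range, the area lost in the two tails is negligible, which is the assumption discussed in the solution.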


Problem 3.8 


We are interested in just one example in order to satisfy the statement of the problem. 
Consider the probability density function shown in Fig. P3.8(a). A plot of the trans- 
formation T(r) in Eq. (3.3-4) using this particular density function is shown in Fig. 
P3.8(b). Because p r (r) is a probability density function we know from the discussion 
in Section 3.3.1 that the transformation T(r) satisfies conditions (a) and (b) stated in 
that section. However, we see from Fig. P3.8(b) that the inverse transformation from s
back to r is not single valued, as there are an infinite number of possible mappings from 
s = 1/2 back to r. It is important to note that the reason the inverse transformation 
function turned out not to be single valued is the gap in p r (r) in the interval [1/4, 3/4]. 


Problem 3.9 


(a) We need to show that the transformation function in Eq. (3.3-8) is monotonic, single- 
valued, and that its values are in the range [0, 1]. From Eq. (3.3-8), 

$$s_k = T(r_k) = \sum_{j=0}^{k} p_r(r_j) = \sum_{j=0}^{k} \frac{n_j}{n}, \qquad k = 0, 1, \ldots, L-1.$$

Because all the $p_r(r_j)$ are positive, it follows that $T(r_k)$ is monotonic. Because all the
$p_r(r_j)$ are finite, and the limit of summation is finite, it follows that $T(r_k)$ is of finite
slope and thus is a single-valued function. Finally, since the sum of all the $p_r(r_j)$ is 1,
it follows that $0 \le s_k \le 1$.


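Figure P3.8. (a) Probability density function $p_r(r)$. (b) Transformation $T(r)$. The
horizontal axis is marked at 0, 1/4, 1/2, 3/4, and 1.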

(b) From the discussion in Problem 3.8, it follows that if an image has missing gray 
levels the histogram equalization transformation function given above will be constant 
in the interval of the missing gray levels. Thus, in theory, the inverse mapping will 
not be single-valued in the discrete case either. In practice, assuming that we wanted 
to perform the inverse transformation, this is not important for the following reason: 
Assume that no gray-level values exist in the open interval $(a, b)$, so that $r_a$ is the last
gray level before the empty gray-level band begins and $r_b$ is the first gray level right after
the empty band ends. The corresponding mapped gray levels are $s_a$ and $s_b$. The fact
that no gray levels $r$ exist in interval $(a, b)$ means that no gray levels will exist between
$s_a$ and $s_b$ either, and, therefore, there will be no levels $s$ to map back to $r$ in the bands
where the multi-valued inverse function would present problems. Thus, in practice, the
inverse not being single-valued is not an issue, since it would not be needed.
Note that mapping back from $s_a$ and $s_b$ presents no problems, since $T(r_a)$ and $T(r_b)$
(and thus their inverses) are different. A similar discussion applies if there is more than
one band empty of gray levels.

(c) If none of the gray levels $r_k$, $k = 1, 2, \ldots, L - 1$, are missing (that is, if none of the
$p_r(r_k)$ are 0), then $T(r_k)$ will be strictly monotonic. This implies that the inverse
transformation will be of finite slope and thus will be single-valued.


Problem 3.10 


First, we obtain the histogram equalization transformation: 

$$s = T(r) = \int_0^r p_r(w)\,dw = \int_0^r (-2w + 2)\,dw = -r^2 + 2r.$$

Next we find

$$v = G(z) = \int_0^z p_z(w)\,dw = \int_0^z 2w\,dw = z^2.$$

Finally,

$$z = G^{-1}(v) = \pm\sqrt{v}.$$

But only positive gray levels are allowed, so $z = \sqrt{v}$. Then, we replace $v$ with $s$, which
in turn is $-r^2 + 2r$, and we have

$$z = \sqrt{-r^2 + 2r}.$$
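
The result can be checked numerically: the probability mass of $p_r$ below $r$ should equal the mass of $p_z$ below the mapped level $z$. The sketch below is one such check, using a midpoint-rule integrator written for this purpose; the helper names are illustrative.

```python
import math

def p_r(w):
    return -2.0 * w + 2.0      # input density on [0, 1]

def p_z(w):
    return 2.0 * w             # specified output density on [0, 1]

def integrate(density, upper, steps=2000):
    """Midpoint-rule integral of density from 0 to upper."""
    h = upper / steps
    return sum(density((i + 0.5) * h) for i in range(steps)) * h

def specified_level(r):
    """z = sqrt(-r**2 + 2r), the mapping derived above."""
    return math.sqrt(-r * r + 2.0 * r)

# Largest mismatch between the two cumulative masses at a few test levels.
worst = max(abs(integrate(p_r, r) - integrate(p_z, specified_level(r)))
            for r in (0.1, 0.25, 0.5, 0.75, 0.9))
```

Because both densities are linear, the midpoint rule is exact up to round-off, so `worst` should be on the order of floating-point error.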


Problem 3.11 


The value of the histogram component corresponding to the $k$th intensity level in a
neighborhood is

$$p_r(r_k) = \frac{n_k}{n}$$

for $k = 0, 1, \ldots, K - 1$, where $n_k$ is the number of pixels having gray level value $r_k$, $n$
is the total number of pixels in the neighborhood, and $K$ is the total number of possible
gray levels. Suppose that the neighborhood is moved one pixel to the right. This deletes
the leftmost column and introduces a new column on the right. The updated histogram
then becomes

$$p'_r(r_k) = \frac{1}{n}\left[n_k - n_{L_k} + n_{R_k}\right]$$

for $k = 0, 1, \ldots, K - 1$, where $n_{L_k}$ is the number of occurrences of level $r_k$ on the left
column and $n_{R_k}$ is the similar quantity on the right column. The preceding equation can
be written also as

$$p'_r(r_k) = p_r(r_k) + \frac{1}{n}\left[n_{R_k} - n_{L_k}\right]$$

for $k = 0, 1, \ldots, K - 1$. The same concept applies to other modes of neighborhood
motion:

$$p'_r(r_k) = p_r(r_k) + \frac{1}{n}\left[b_k - a_k\right]$$

for $k = 0, 1, \ldots, K - 1$, where $a_k$ is the number of pixels with value $r_k$ in the
neighborhood area deleted by the move, and $b_k$ is the corresponding number introduced by
the move.
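
The update rule translates directly into code. The sketch below works with unnormalized counts $n_k$ (divide by $n$ to obtain $p_r$) and assumes the image is a list of rows of integer gray levels; the function names and the small test image are hypothetical.

```python
def window_histogram(image, top, left, size, levels):
    """Unnormalized histogram n_k of the size x size window at (top, left)."""
    counts = [0] * levels
    for row in image[top:top + size]:
        for value in row[left:left + size]:
            counts[value] += 1
    return counts

def shift_right(image, counts, top, left, size):
    """Update counts in place when the window at `left` moves one pixel right."""
    for row in image[top:top + size]:
        counts[row[left]] -= 1          # n_Lk: the column that leaves
        counts[row[left + size]] += 1   # n_Rk: the column that enters
    return counts

image = [[0, 1, 2, 3, 1],
         [1, 1, 0, 2, 3],
         [2, 0, 1, 1, 0]]
counts = window_histogram(image, top=0, left=0, size=3, levels=4)
updated = shift_right(image, list(counts), top=0, left=0, size=3)
```

Each move touches only 2 x `size` pixels instead of `size` squared, which is the point of the recursive formulation.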


Problem 3.12 


The purpose of this simple problem is to make the student think of the meaning of his- 
tograms and arrive at the conclusion that histograms carry no information about spatial 
properties of images. Thus, the only time that the histogram of the images formed by 
the operations shown in the problem statement can be determined in terms of the orig- 
inal histograms is when one or both of the images is (are) constant. In (d) we have 
the additional requirement that none of the pixels of $g(x, y)$ can be 0. Assume for
convenience that the histograms are not normalized, so that, for example, $h_f(r_k)$ is the
number of pixels in $f(x, y)$ having gray level $r_k$, and assume that all the pixels in $g(x, y)$
have constant value $c$. The pixels of both images are assumed to be positive. Finally,
let $u_k$ denote the gray levels of the pixels of the images formed by any of the arithmetic
operations given in the problem statement. Under the preceding set of conditions, the
histograms are determined as follows:

(a) The histogram $h_{sum}(u_k)$ of the sum is obtained by letting $u_k = r_k + c$, and
$h_{sum}(u_k) = h_f(r_k)$ for all $k$. In other words, the values (heights) of the components of
$h_{sum}$ are the same as the components of $h_f$, but their locations on the gray axis are
shifted right by an amount $c$.

(b) Similarly, the histogram $h_{diff}(u_k)$ of the difference has the same components as $h_f$
but their locations are moved left by an amount $c$ as a result of the subtraction operation.

(c) Following the same reasoning, the values (heights) of the components of histogram
$h_{prod}(u_k)$ of the product are the same as $h_f$, but their locations are at $u_k = c \times r_k$. Note
that while the spacing between components of the resulting histograms in (a) and (b)
was not affected, the spacing between components of $h_{prod}(u_k)$ will be spread out by an
amount $c$.

(d) Finally, assuming that $c \ne 0$, the components of $h_{div}(u_k)$ are the same as those of
$h_f$, but their locations will be at $u_k = r_k/c$. Thus, the spacing between components of
$h_{div}(u_k)$ will be compressed by an amount equal to $1/c$.

The preceding solutions are applicable if image $f(x, y)$ also is constant. In this case
the four histograms just discussed would each have only one component. Their locations
would be affected as described in (a) through (d).


Problem 3.13 


Using 10 bits (with one bit being the sign bit) allows numbers in the range $-511$ to 511.
The process of repeated subtractions can be expressed as

$$d_K(x,y) = a(x,y) - \sum_{k=1}^{K} b(x,y) = a(x,y) - K \times b(x,y)$$

where $K$ is the largest value such that $d_K(x, y)$ does not exceed $-511$ at any coordinates
$(x, y)$, at which time the subtraction process stops. We know nothing about the images,
only that both have values ranging from 0 to 255. Therefore, all we can determine are
the maximum and minimum number of times that the subtraction can be carried out and
the possible range of gray-level values in each of these two situations.

Because it is given that $b(x, y)$ has at least one pixel valued 255, the maximum value
that $K$ can have before the subtraction exceeds $-511$ is 3. This condition occurs when,
at some pair of coordinates $(s, t)$, $a(s,t) = b(s,t) = 255$. In this case, the possible
range of values in the difference image is $-510$ to 255. The latter condition can occur if,
at some pair of coordinates $(i,j)$, $a(i,j) = 255$ and $b(i,j) = 0$.

The minimum value that $K$ will have is 2, which occurs when, at some pair of
coordinates, $a(s, t) = 0$ and $b(s, t) = 255$. In this case, the possible range of values in the
difference image again is $-510$ to 255. The latter condition can occur if, at some pair
of coordinates $(i,j)$, $a(i,j) = 255$ and $b(i,j) = 0$.
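
The two extreme cases can be reproduced with a short loop. This is a sketch for a single pixel pair; the function name and the default limit argument are illustrative.

```python
def max_subtractions(a, b, lower_limit=-511):
    """Largest K such that a - K * b stays at or above lower_limit (b > 0)."""
    if b <= 0:
        raise ValueError("b must be positive, otherwise the limit is never reached")
    k = 0
    while a - (k + 1) * b >= lower_limit:
        k += 1
    return k

k_max = max_subtractions(a=255, b=255)   # brightest a against brightest b
k_min = max_subtractions(a=0, b=255)     # darkest a against brightest b
```

In both cases the last admissible difference at that pixel is `a - K * b`, which evaluates to the lower end of the range quoted in the solution.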


Problem 3.14 


Let $g(x,y)$ denote the golden image, and let $f(x, y)$ denote any input image acquired
during routine operation of the system. Change detection via subtraction is based on
computing the simple difference $d(x,y) = g(x,y) - f(x,y)$. The resulting image
$d(x, y)$ can be used in two fundamental ways for change detection. One way is to use a
pixel-by-pixel analysis. In this case we say that $f(x,y)$ is “close enough” to the golden
image if all the pixels in $d(x,y)$ fall within a specified threshold band $[T_{min}, T_{max}]$
where $T_{min}$ is negative and $T_{max}$ is positive. Usually, the same value of threshold is
used for both negative and positive differences, in which case we have a band $[-T, T]$
in which all pixels of $d(x, y)$ must fall in order for $f(x, y)$ to be declared acceptable.
The second major approach is simply to sum all the pixels in $|d(x, y)|$ and compare the
sum against a threshold $S$. Note that the absolute value needs to be used to avoid errors
cancelling out. This is a much cruder test, so we will concentrate on the first approach.
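
The two tests can be sketched as follows, assuming images stored as lists of rows of integers. The function names, the threshold, and the tiny images are invented for illustration.

```python
def within_band(golden, test, t):
    """Pixel-by-pixel test: every difference g - f must lie in [-t, t]."""
    return all(-t <= g - f <= t
               for g_row, f_row in zip(golden, test)
               for g, f in zip(g_row, f_row))

def sum_abs_difference(golden, test):
    """Cruder test statistic: sum of |g - f| over all pixels."""
    return sum(abs(g - f)
               for g_row, f_row in zip(golden, test)
               for g, f in zip(g_row, f_row))

golden = [[10, 10], [10, 10]]
sample = [[15, 5], [10, 10]]     # errors of -5 and +5 relative to golden
signed_sum = sum(g - f for g_row, f_row in zip(golden, sample)
                 for g, f in zip(g_row, f_row))
```

Here the signed differences cancel to zero even though two pixels are off by 5, which is why the second approach uses the absolute value.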

There are three fundamental factors that need tight control for difference-based
inspection to work: (1) proper registration, (2) controlled illumination, and (3) noise levels
that are low enough so that difference values are not affected appreciably by variations
due to noise. The first condition basically addresses the requirement that comparisons
be made between corresponding pixels. Two images can be identical, but if they are
displaced with respect to each other, comparing the differences between them makes
no sense. Often, special markings are manufactured into the product for mechanical or
image-based alignment.

Controlled illumination (note that “illumination” is not limited to visible light) obviously 
is important because changes in illumination can affect dramatically the values in a 
difference image. One approach often used in conjunction with illumination control is 
intensity scaling based on actual conditions. For example, the products could have one 
or more small patches of a tightly controlled color, and the intensity (and perhaps even 
color) of each pixel in the entire image would be modified based on the actual versus
expected intensity and/or color of the patches in the image being processed. 

Finally, the noise content of a difference image needs to be low enough so that it does 
not materially affect comparisons between the golden and input images. Good signal 
strength goes a long way toward reducing the effects of noise. Another (sometimes 
complementary) approach is to implement image processing techniques (e.g., image 
averaging) to reduce noise. 

Obviously there are a number of variations of the basic theme just described. For
example, additional intelligence in the form of tests that are more sophisticated than
pixel-by-pixel threshold comparisons can be implemented. A technique often used in this regard
is to subdivide the golden image into different regions and perform different (usually
more than one) tests in each of the regions, based on expected region content.


Problem 3.15


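(a) From Eq. (3.4-3), at any point $(x, y)$,

$$\bar{g} = \frac{1}{K}\sum_{i=1}^{K} g_i = \frac{1}{K}\sum_{i=1}^{K} f_i + \frac{1}{K}\sum_{i=1}^{K} \eta_i.$$

Then

$$E\{\bar{g}\} = \frac{1}{K}\sum_{i=1}^{K} E\{f_i\} + \frac{1}{K}\sum_{i=1}^{K} E\{\eta_i\}.$$

But all the $f_i$ are the same image, so $E\{f_i\} = f$. Also, it is given that the noise has
zero mean, so $E\{\eta_i\} = 0$. Thus, it follows that $E\{\bar{g}\} = f$, which proves the validity of
Eq. (3.4-4).

(b) From (a),

$$\bar{g} = \frac{1}{K}\sum_{i=1}^{K} g_i = \frac{1}{K}\sum_{i=1}^{K} f_i + \frac{1}{K}\sum_{i=1}^{K} \eta_i.$$

It is known from random-variable theory that the variance of the sum of uncorrelated
random variables is the sum of the variances of those variables (Papoulis [1991]). Since
the elements of $f$ are constant and the $\eta_i$ are uncorrelated, then

$$\sigma^2_{\bar{g}} = \sigma^2_f + \frac{1}{K^2}\left[\sigma^2_{\eta_1} + \sigma^2_{\eta_2} + \cdots + \sigma^2_{\eta_K}\right].$$

The first term on the right side is 0 because the elements of $f$ are constants. The various
$\sigma^2_{\eta_i}$ are simply samples of the noise, which has variance $\sigma^2_\eta$. Thus, $\sigma^2_{\eta_i} = \sigma^2_\eta$ and we
have

$$\sigma^2_{\bar{g}} = \frac{K}{K^2}\sigma^2_\eta = \frac{1}{K}\sigma^2_\eta$$

which proves the validity of Eq. (3.4-5).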
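
A small simulation illustrates the $1/K$ variance reduction. This is a sketch under stated assumptions: the noise is taken to be Gaussian (the derivation only requires zero mean and uncorrelated samples), the constant image term is dropped because it contributes no variance, and the seed, sample count, and parameter values are arbitrary.

```python
import random
import statistics

def averaged_noise(k, sigma, trials, rng):
    """Average k zero-mean Gaussian noise samples, repeated `trials` times."""
    return [sum(rng.gauss(0.0, sigma) for _ in range(k)) / k
            for _ in range(trials)]

rng = random.Random(0)            # fixed seed so the run is repeatable
sigma, k = 4.0, 10
single = averaged_noise(1, sigma, 20000, rng)
averaged = averaged_noise(k, sigma, 20000, rng)
ratio = statistics.pvariance(averaged) / statistics.pvariance(single)
```

The value of `ratio` should land close to `1 / k`, up to sampling error.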


Problem 3.16 


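With reference to Section 3.4.2, when $i = 1$ (no averaging), we have

$$\bar{g}(1) = g_1 \quad \text{and} \quad \sigma^2_{\bar{g}(1)} = \sigma^2_\eta.$$

When $i = K$,

$$\bar{g}(K) = \frac{1}{K}\sum_{i=1}^{K} g_i \quad \text{and} \quad \sigma^2_{\bar{g}(K)} = \frac{1}{K}\sigma^2_\eta.$$

We want the ratio of $\sigma^2_{\bar{g}(K)}$ to $\sigma^2_{\bar{g}(1)}$ to be 1/10, so

$$\frac{\sigma^2_{\bar{g}(K)}}{\sigma^2_{\bar{g}(1)}} = \frac{1}{10} = \frac{\frac{1}{K}\sigma^2_\eta}{\sigma^2_\eta}$$

from which we get $K = 10$. Since the images are generated at 30 frames/s, the
stationary time required is 1/3 s.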


Problem 3.17 


(a) Consider a 3 x 3 mask first. Since all the coefficients are 1 (we are ignoring the 1/9 
scale factor), the net effect of the lowpass filter operation is to add all the gray levels of 
pixels under the mask. Initially, it takes 8 additions to produce the response of the mask. 
However, when the mask moves one pixel location to the right, it picks up only one new 
column. The new response can be computed as 

$$R_{new} = R_{old} - C_1 + C_3$$

where $C_1$ is the sum of pixels under the first column of the mask before it was moved,
and $C_3$ is the similar sum in the column it picked up after it moved. This is the basic
box-filter or moving-average equation. For a 3 x 3 mask it takes 2 additions to get $C_3$
($C_1$ was already computed). To this we add one subtraction and one addition to get
$R_{new}$. Thus, a total of 4 arithmetic operations are needed to update the response after
one move. This is a recursive procedure for moving from left to right along one row of
the image. When we get to the end of a row, we move down one pixel (the nature of the
computation is the same) and continue the scan in the opposite direction.

For a mask of size $n \times n$, $(n - 1)$ additions are needed to obtain $C_3$, plus the single
subtraction and addition needed to obtain $R_{new}$, which gives a total of $(n + 1)$
arithmetic operations after each move. A brute-force implementation would require $n^2 - 1$
additions after each move.


(b) The computational advantage is

$$A = \frac{n^2 - 1}{n + 1} = \frac{(n + 1)(n - 1)}{(n + 1)} = n - 1.$$

The plot of $A$ as a function of $n$ is a simple linear function starting at $A = 1$ for $n = 2$.
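
The recursive update from part (a) can be sketched as below for one left-to-right pass along a band of $n$ rows. The image is assumed to be a list of rows of numbers, the $1/n^2$ scale factor is ignored as in the solution, and the function names are illustrative. Column sums are cached so that the column leaving the mask is not recomputed.

```python
from collections import deque

def row_sums_brute(image, top, n):
    """Sum under each n x n mask position along one band, from scratch."""
    width = len(image[0])
    return [sum(image[top + i][left + j] for i in range(n) for j in range(n))
            for left in range(width - n + 1)]

def row_sums_recursive(image, top, n):
    """Same sums using R_new = R_old - C_1 + C_3 with cached column sums."""
    width = len(image[0])

    def column(x):
        return sum(image[top + i][x] for i in range(n))   # n - 1 additions

    columns = deque(column(x) for x in range(n))
    response = sum(columns)
    out = [response]
    for right in range(n, width):
        entering = column(right)
        response = response - columns.popleft() + entering  # 1 sub, 1 add
        columns.append(entering)
        out.append(response)
    return out

image = [[1, 2, 3, 4, 5, 6],
         [2, 0, 1, 3, 1, 2],
         [4, 4, 0, 1, 2, 3]]
```

Both functions return the same list of responses; only the operation count per move differs.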


Problem 3.18 


One of the easiest ways to look at repeated applications of a spatial filter is to use
superposition. Let $f(x, y)$ and $h(x, y)$ denote the image and the filter function, respectively.
Assuming square images of size $N \times N$ for convenience, we can express $f(x, y)$ as the
sum of at most $N^2$ images, each of which has only one nonzero pixel (initially, we
assume that $N$ can be infinite). Then, the process of running $h(x, y)$ over $f(x, y)$ can be
expressed as the following convolution:

$$h(x,y) * f(x,y) = h(x,y) * \left[f_1(x,y) + f_2(x,y) + \cdots + f_{N^2}(x,y)\right].$$

Suppose for illustrative purposes that $f_i(x,y)$ has value 1 at its center, while the other
pixels are valued 0, as discussed above (see Fig. P3.18(a)). If $h(x, y)$ is a 3 x 3 mask of
1/9's (Fig. P3.18(b)), then convolving $h(x, y)$ with $f_i(x, y)$ will produce an image with a
3 x 3 array of 1/9's at its center and 0's elsewhere, as shown in Fig. P3.18(c). If $h(x, y)$
is now applied to this image, the resulting image will be as shown in Fig. P3.18(d).
Note that the sum of the nonzero pixels in both Figs. P3.18(c) and (d) is the same, and
equal to the value of the original pixel. Thus, it is intuitively evident that successive
applications of $h(x, y)$ will “diffuse” the nonzero value of $f_i(x, y)$ (not an unexpected
result, because $h(x, y)$ is a blurring filter). Since the sum remains constant, the values
of the nonzero elements will become smaller and smaller as the number of applications
of the filter increases. The overall result is given by adding all the convolved $f_k(x, y)$,
for $k = 1, 2, \ldots, N^2$. The net effect of successive applications of the lowpass spatial
filter $h(x, y)$ is thus seen to be more and more blurring, with the value of each pixel
“redistributed” among the others. The average value of the blurred image will thus
be the same as the average value of $f(x, y)$.

It is noted that every iteration of blurring further diffuses the values outwardly from the
starting point. In the limit, the values would get infinitely small, but, because the average
value remains constant, this would require an image of infinite spatial proportions. It is
at this juncture that border conditions become important. Although it is not required
in the problem statement, it is instructive to discuss in class the effect of successive
applications of $h(x, y)$ to an image of finite proportions. The net effect is that, since the
values cannot diffuse outward past the boundary of the image, the denominator in the
successive applications of averaging eventually overpowers the pixel values, driving the
image to zero in the limit. A simple example of this is given in Fig. P3.18(e), which
shows an array of size 1 x 7 that is blurred by successive applications of the 1 x 3 mask
$h(y) = \frac{1}{3}[1, 1, 1]$. We see that, as long as the values of the blurred 1 can diffuse out, the
sum, $S$, of the resulting pixels is 1. However, when the boundary is met, an assumption
must be made regarding how mask operations on the border are treated. Here, we used
the commonly made assumption that pixel values immediately past the boundary are 0.
The mask operation does not go beyond the boundary, however. In this example, we
see that the sum of the pixel values begins to decrease with successive applications of
the mask. In the limit, the term $1/(3)^n$ would overpower the sum of the pixel values,
yielding an array of 0's.


Figure P3.18. Panels (a) through (e). The legend in the original figure identifies pixel
values of 1, 1/9, 9/81, 6/81, 3/81, 2/81, and 1/81.
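
The 1 x 7 example can be reproduced exactly with rational arithmetic. The sketch below assumes the same border rule described above (pixels immediately past the boundary are 0); the function name is illustrative.

```python
from fractions import Fraction

def blur_once(row):
    """Apply the 1 x 3 mask (1/3)[1, 1, 1]; pixels past the border count as 0."""
    padded = [Fraction(0)] + list(row) + [Fraction(0)]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3
            for i in range(len(row))]

row = [Fraction(0)] * 7
row[3] = Fraction(1)             # single nonzero pixel at the center
sums = []
for _ in range(6):
    row = blur_once(row)
    sums.append(sum(row))        # S after each application of the mask
```

The sum stays at 1 for the first three applications, while the blurred values have not yet reached the border, and drops below 1 from the fourth application onward.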


Problem 3.19 


(a) There are $n^2$ points in an $n \times n$ median filter mask. Since $n$ is odd, the median
value, $\zeta$, is such that there are $(n^2 - 1)/2$ points with values less than or equal to $\zeta$
and the same number with values greater than or equal to $\zeta$. However, since the area
$A$ (number of points) in the cluster is less than one half $n^2$, and $A$ and $n$ are integers,
it follows that $A$ is always less than or equal to $(n^2 - 1)/2$. Thus, even in the extreme
case when all cluster points are encompassed by the filter mask, there are not enough
points in the cluster for any of them to be equal to the value of the median (remember,
we are assuming that all cluster points are lighter or darker than the background points).
Therefore, if the center point in the mask is a cluster point, it will be set to the median
value, which is a background shade, and thus it will be “eliminated” from the cluster.
This conclusion obviously applies to the less extreme case when the number of cluster
points encompassed by the mask is less than the maximum size of the cluster.

(b) For the conclusion reached in (a) to hold, the number of points that we consider
cluster (object) points can never exceed $(n^2 - 1)/2$. Thus, two or more different clusters
cannot be in close enough proximity for the filter mask to encompass points from more
than one cluster at any mask position. It then follows that no two points from different
clusters can be closer than the diagonal dimension of the mask minus one cell (which
can be occupied by a point from one of the clusters). Assuming a grid spacing of 1 unit,
the minimum distance between any two points of different clusters then must be greater
than $\sqrt{2}(n - 1)$. In other words, these points must be separated by at least the distance
spanned by $n - 1$ cells along the mask diagonal.
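
Part (a) can be illustrated with a small median filter. This is a sketch, assuming a binary image stored as a list of rows, a dark (0) background, and zero padding outside the image; the function name and test images are invented for illustration.

```python
import statistics

def median_filter(image, n):
    """n x n median filter (n odd); pixels outside the image count as 0."""
    half = n // 2
    rows, cols = len(image), len(image[0])

    def pixel(y, x):
        return image[y][x] if 0 <= y < rows and 0 <= x < cols else 0

    return [[statistics.median(pixel(y + dy, x + dx)
                               for dy in range(-half, half + 1)
                               for dx in range(-half, half + 1))
             for x in range(cols)]
            for y in range(rows)]

def blank(size):
    return [[0] * size for _ in range(size)]

small = blank(6)                       # 2 x 2 cluster: area 4 <= (9 - 1) / 2
for y in (2, 3):
    for x in (2, 3):
        small[y][x] = 1

large = blank(6)                       # 3 x 3 cluster: area 9 > (9 - 1) / 2
for y in (1, 2, 3):
    for x in (1, 2, 3):
        large[y][x] = 1

small_out = median_filter(small, 3)
large_out = median_filter(large, 3)
```

The 2 x 2 cluster satisfies the area condition for a 3 x 3 mask and is removed entirely, while the 3 x 3 cluster violates it and keeps its center pixel.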


Problem 3.20 


(a) Numerically sort the $n^2$ values. The median is

$$\zeta = [(n^2 + 1)/2]\text{-th largest value.}$$

(b) Once the values have been sorted one time, we simply delete the values in the trailing 
edge of the neighborhood and insert the values in the leading edge in the appropriate 
locations in the sorted array. 
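
A minimal sketch of the update in (b), using the standard-library `bisect` module to delete and insert values in an already sorted list. The function names and the sample values are illustrative.

```python
import bisect

def slide_sorted(sorted_values, leaving, entering):
    """Update a sorted neighborhood list in place after the mask moves."""
    for value in leaving:                       # trailing edge of the mask
        del sorted_values[bisect.bisect_left(sorted_values, value)]
    for value in entering:                      # leading edge of the mask
        bisect.insort(sorted_values, value)
    return sorted_values

def median_of_sorted(sorted_values):
    """The [(n^2 + 1)/2]-th value of an odd-length sorted list."""
    return sorted_values[(len(sorted_values) + 1) // 2 - 1]

window = sorted([3, 1, 4, 1, 5, 9, 2, 6, 5])    # initial 3 x 3 neighborhood
before = median_of_sorted(window)
slide_sorted(window, leaving=[3, 1, 2], entering=[7, 8, 0])
after = median_of_sorted(window)
```

Only the values on the trailing and leading edges are touched, so the full sort is done once rather than at every mask position.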


Problem 3.21 


(a) The most extreme case is when the mask is positioned on the center pixel of a 3-pixel 
gap, along a thin segment, in which case a 3 x 3 mask would encompass a completely 
blank field. Since this is known to be the largest gap, the next (odd) mask size up is 
guaranteed to encompass some of the pixels in the segment. Thus, the smallest mask 
that will do the job is a 5 x 5 averaging mask. 

(b) The smallest average value produced by the mask is when it encompasses only two
pixels of the segment. This average value is a gray-scale value, not binary, like the rest
of the segment pixels. Denote the smallest average value by $A_{min}$, and the binary values
of pixels in the thin segment by $B$. Clearly, $A_{min}$ is less than $B$. Then, setting the
binarizing threshold slightly smaller than $A_{min}$ will create one binary pixel of value $B$
in the center of the mask.


Problem 3.22 


From Fig. 3.35, the vertical bars are 5 pixels wide, 100 pixels high, and their separation 
is 20 pixels. The phenomenon in question is related to the horizontal separation between 
bars, so we can simplify the problem by considering a single scan line through the bars 
in the image. The key to answering this question lies in the fact that the distance (in 
pixels) between the onset of one bar and the onset of the next one (say, to its right) is 25 
pixels. Consider the scan line shown in Fig. P3.22. Also shown is a cross section of a 
25 x 25 mask. The response of the mask is the average of the pixels that it encompasses. 
We note that when the mask moves one pixel to the right, it loses one value of the vertical
bar on the left, but it picks up an identical one on the right, so the response doesn't
change. In fact, the number of pixels belonging to the vertical bars and contained
within the mask does not change, regardless of where the mask is located (as long as it
is contained within the bars, and not near the edges of the set of bars). The fact that the
number of bar pixels under the mask does not change is due to the peculiar separation
between bars and the width of the lines in relation to the 25-pixel width of the mask.
This constant response is the reason no white gaps are seen in the image shown in the
problem statement. Note that this constant response does not happen with the 23 x 23
or the 45 x 45 masks because they are not “synchronized” with the width of the bars and
their separation.
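
The effect is easy to reproduce on a single scan line. The sketch below builds a periodic line of 5-pixel bars separated by 20 pixels and averages it with one-dimensional masks of width 25, 23, and 45; the bar value of 1, the number of bars, and the function name are assumptions for illustration.

```python
def mask_responses(line, width):
    """Average of every full window of the given width along the scan line."""
    return [sum(line[i:i + width]) / width
            for i in range(len(line) - width + 1)]

# Eight periods of a 5-pixel bar followed by a 20-pixel gap (period 25).
line = ([1] * 5 + [0] * 20) * 8

r25 = mask_responses(line, 25)
r23 = mask_responses(line, 23)
r45 = mask_responses(line, 45)
```

Every 25-pixel window covers exactly one full period, so `r25` holds a single repeated value, whereas `r23` and `r45` vary with position.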


Figure P3.22. Scan line through the bars, with the mask response, the center of the
mask, and a 25-pixel span marked.


Problem 3.23


There are at most q 2 points in the area for which we want to reduce the gray level of 
each pixel to one-tenth its original value. Consider an averaging mask of size n x n 
encompassing the q x q neighborhood. The averaging mask has n 2 points of which we 
are assuming that q 2 points are from the object and the rest from the background. Note 
that this assumption implies separation between objects at least the area of the mask all 
around each object. The problem becomes intractable unless this assumption is made. 
This condition was not given in the problem statement on purpose in order to force the 
student to arrive at that conclusion. If the instructor wishes to simplify the problem, this 
should then be mentioned when the problem is assigned. A further simplification is to 
tell the students that the gray level of the background is 0. 


Let $B$ represent the gray level of background pixels, let $a_i$ denote the gray levels of
points inside the mask and $o_i$ the levels of the objects. In addition, let $S_a$ denote the
set of points in the averaging mask, $S_o$ the set of points in the object, and $S_b$ the set of
points in the mask that are not object points. Then, the response of the averaging mask
at any point on the image can be written as

$$\begin{aligned}
R &= \frac{1}{n^2}\sum_{a_i \in S_a} a_i \\
  &= \frac{1}{n^2}\left[\sum_{o_j \in S_o} o_j + \sum_{a_k \in S_b} a_k\right] \\
  &= \frac{1}{n^2}\left[\frac{q^2}{q^2}\sum_{o_j \in S_o} o_j\right] + \frac{1}{n^2}\left[\sum_{a_k \in S_b} a_k\right] \\
  &= \frac{q^2}{n^2}Q + \frac{1}{n^2}\left[(n^2 - q^2)B\right]
\end{aligned}$$

where $Q$ denotes the average value of object points. Let the maximum expected average
value of object points be denoted by $Q_{max}$. Then we want the response of the mask at
any point on the object under this maximum condition to be less than one-tenth $Q_{max}$,
or

$$\frac{q^2}{n^2}Q_{max} + \frac{1}{n^2}\left[(n^2 - q^2)B\right] < \frac{1}{10}Q_{max}$$

from which we get the requirement

$$n > q\left[\frac{10(Q_{max} - B)}{(Q_{max} - 10B)}\right]^{1/2}$$

for the minimum size of the averaging mask. Note that if the background gray level is
0, the minimum mask size is $n > \sqrt{10}\,q$. If this was a fact specified by the instructor,
or the student made this assumption from the beginning, then this answer follows almost by
inspection.
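
The requirement can be evaluated for specific numbers as sketched below. Rounding up to the next odd integer is an added assumption (masks are usually odd-sized), and the function names and sample values are illustrative.

```python
import math

def mask_response(n, q, q_avg, background):
    """R = (q^2 / n^2) Q + (1 / n^2) (n^2 - q^2) B."""
    return (q * q * q_avg + (n * n - q * q) * background) / (n * n)

def min_mask_size(q, q_max, background):
    """Smallest odd n with n > q * sqrt(10 (Qmax - B) / (Qmax - 10 B))."""
    if q_max <= 10 * background:
        raise ValueError("the bound requires Q_max > 10 B")
    bound = q * math.sqrt(10.0 * (q_max - background)
                          / (q_max - 10.0 * background))
    n = math.floor(bound) + 1          # smallest integer strictly above bound
    return n if n % 2 == 1 else n + 1  # assumption: use an odd mask size

n_dark = min_mask_size(q=3, q_max=200, background=0)
n_gray = min_mask_size(q=3, q_max=200, background=10)
```

For each result, `mask_response` at the returned size falls below one-tenth of `q_max`, while the next smaller odd size does not.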


Problem 3.24 


The student should realize that both the Laplacian and the averaging process are linear 
operations, so it makes no difference which one is applied first. 
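
The claim can be checked numerically. The following is a minimal sketch, assuming a 4-neighbor Laplacian, a 3 x 3 box average, and periodic (wrap-around) borders; these choices and the function names are illustrative assumptions, not part of the problem statement.

```python
import numpy as np

def laplacian(img):
    # 4-neighbor discrete Laplacian with periodic (wrap-around) borders
    return (np.roll(img, 1, 0) + np.roll(img, -1, 0)
            + np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4.0 * img)

def average3(img):
    # 3x3 box average with periodic borders
    acc = np.zeros_like(img, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            acc += np.roll(np.roll(img, dy, 0), dx, 1)
    return acc / 9.0

rng = np.random.default_rng(0)
f = rng.random((32, 32))

# Apply the two linear operators in both orders and compare
max_diff = float(np.max(np.abs(laplacian(average3(f)) - average3(laplacian(f)))))
print(max_diff)  # differs from zero only by floating-point round-off
```

Both operators are shift-invariant linear filters, so under these assumptions the two orderings agree up to round-off.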


Problem 3.25 


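The Laplacian operator is defined as

$$
\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}
$$

for the unrotated coordinates and as

$$
\nabla^2 f = \frac{\partial^2 f}{\partial x'^2} + \frac{\partial^2 f}{\partial y'^2}
$$

for rotated coordinates. It is given that

$$
x = x' \cos\theta - y' \sin\theta \quad \text{and} \quad y = x' \sin\theta + y' \cos\theta
$$

where $\theta$ is the angle of rotation. We want to show that the right sides of the first two
equations are equal. We start with

$$
\frac{\partial f}{\partial x'} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial x'} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial x'}
= \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta.
$$

Taking the partial derivative of this expression again with respect to $x'$ yields

$$
\frac{\partial^2 f}{\partial x'^2} = \frac{\partial^2 f}{\partial x^2}\cos^2\theta
+ \frac{\partial}{\partial x}\!\left(\frac{\partial f}{\partial y}\right)\sin\theta\cos\theta
+ \frac{\partial}{\partial y}\!\left(\frac{\partial f}{\partial x}\right)\cos\theta\sin\theta
+ \frac{\partial^2 f}{\partial y^2}\sin^2\theta.
$$

Next, we compute

$$
\frac{\partial f}{\partial y'} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial y'} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial y'}
= -\frac{\partial f}{\partial x}\sin\theta + \frac{\partial f}{\partial y}\cos\theta.
$$

Taking the derivative of this expression again with respect to $y'$ gives

$$
\frac{\partial^2 f}{\partial y'^2} = \frac{\partial^2 f}{\partial x^2}\sin^2\theta
- \frac{\partial}{\partial x}\!\left(\frac{\partial f}{\partial y}\right)\cos\theta\sin\theta
- \frac{\partial}{\partial y}\!\left(\frac{\partial f}{\partial x}\right)\sin\theta\cos\theta
+ \frac{\partial^2 f}{\partial y^2}\cos^2\theta.
$$

Adding the two expressions for the second derivatives yields

$$
\frac{\partial^2 f}{\partial x'^2} + \frac{\partial^2 f}{\partial y'^2} = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}
$$

which proves that the Laplacian operator is independent of rotation.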


Problem 3.26 


Unsharp masking is high-boost filtering [Eq. (3.7-11)] with A = 1. Figure P3.26 shows
the two possible solutions based on that equation. The left and right masks correspond 
to the first and second line in the equation, respectively. 



Figure P3.26


Problem 3.27 


Consider the following equation:

$$
\begin{aligned}
f(x,y) - \nabla^2 f(x,y) &= f(x,y) - [f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4f(x,y)] \\
&= 6f(x,y) - [f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) + f(x,y)] \\
&= 5\left\{1.2f(x,y) - \tfrac{1}{5}[f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) + f(x,y)]\right\} \\
&= 5\left[1.2f(x,y) - \bar{f}(x,y)\right]
\end{aligned}
$$

where $\bar{f}(x,y)$ denotes the average of $f(x,y)$ in a predefined neighborhood that is
centered at $(x,y)$ and includes the center pixel and its four immediate neighbors. Treating
the constants in the last line of the above equation as proportionality factors, we may
write

$$
f(x,y) - \nabla^2 f(x,y) \sim f(x,y) - \bar{f}(x,y).
$$

The right side of this equation is recognized as the definition of unsharp masking given
in Eq. (3.7-7). Thus, it has been demonstrated that subtracting the Laplacian from an
image is proportional to unsharp masking.


Problem 3.28 


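(a) From Problem 3.25,

$$
\frac{\partial f}{\partial x'} = \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta
$$

and

$$
\frac{\partial f}{\partial y'} = -\frac{\partial f}{\partial x}\sin\theta + \frac{\partial f}{\partial y}\cos\theta
$$

from which it follows that

$$
\left(\frac{\partial f}{\partial x'}\right)^2 + \left(\frac{\partial f}{\partial y'}\right)^2
= \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2
$$

or

$$
\left[\left(\frac{\partial f}{\partial x'}\right)^2 + \left(\frac{\partial f}{\partial y'}\right)^2\right]^{1/2}
= \left[\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2\right]^{1/2}.
$$

Thus, we see that the magnitude of the gradient is an isotropic operator.


(b) From Eq. (3.7-12), (3.7-14) and the preceding results,

$$
|G_x| = \left|\frac{\partial f}{\partial x}\right|, \qquad |G_y| = \left|\frac{\partial f}{\partial y}\right|,
$$

$$
|G_{x'}| = \left|\frac{\partial f}{\partial x'}\right| = \left|\frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta\right|
$$

and

$$
|G_{y'}| = \left|\frac{\partial f}{\partial y'}\right| = \left|-\frac{\partial f}{\partial x}\sin\theta + \frac{\partial f}{\partial y}\cos\theta\right|.
$$

Clearly, $|G_{x'}| + |G_{y'}| \neq |G_x| + |G_y|$.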
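
A small numeric illustration of both parts may help. This is a sketch under assumed values: the gradient components (3, 4) and the rotation angle of 30 degrees are arbitrary choices, and `rotate_gradient` is a hypothetical helper that applies the rotation relations from Problem 3.25.

```python
import math

def rotate_gradient(gx, gy, theta):
    # Partial derivatives expressed in the rotated (x', y') coordinates
    gxp = gx * math.cos(theta) + gy * math.sin(theta)
    gyp = -gx * math.sin(theta) + gy * math.cos(theta)
    return gxp, gyp

gx, gy = 3.0, 4.0                      # assumed sample gradient components
gxp, gyp = rotate_gradient(gx, gy, math.pi / 6)

mag = math.hypot(gx, gy)               # gradient magnitude, unrotated
mag_rot = math.hypot(gxp, gyp)         # gradient magnitude, rotated
abs_sum = abs(gx) + abs(gy)            # |Gx| + |Gy|, unrotated
abs_sum_rot = abs(gxp) + abs(gyp)      # |Gx'| + |Gy'|, rotated

print(mag, mag_rot)
print(abs_sum, abs_sum_rot)
```

The two magnitudes agree, while the two sums of absolute values do not, which is the distinction drawn in parts (a) and (b).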


Problem 3.29 


It is given that the range of illumination stays in the linear portion of the camera response
range, but no values for the range are given. The fact that images stay in the linear
range simply says that images will not be saturated at the high end or be driven in the
low end to such an extent that the camera will not be able to respond, thus losing image
information irretrievably. The only way to establish a benchmark value for illumination
is when the variable (daylight) illumination is not present. Let $f_0(x, y)$ denote an image
taken under artificial illumination only, with no moving objects (e.g., people or vehicles)
in the scene. This becomes the standard by which all other images will be normalized.
There are numerous ways to solve this problem, but the student must show awareness
that areas in the image likely to change due to moving objects should be excluded from
the illumination-correction approach.


One simple way is to select various representative subareas of $f_0(x, y)$ not likely to
be obscured by moving objects and compute their average intensities. We then select
the minimum and maximum of all the individual average values, denoted by $\bar{f}_{\min}$ and
$\bar{f}_{\max}$. The objective then is to process any input image, $f(x, y)$, so that its minimum
and maximum will be equal to $\bar{f}_{\min}$ and $\bar{f}_{\max}$, respectively. The easiest way to do this
is with a linear transformation function of the form

$$
f_{\text{out}}(x, y) = a f(x, y) + b
$$

where $f_{\text{out}}$ is the output image. It is easily verified that the output image will have the
required minimum and maximum values if we choose

$$
a = \frac{\bar{f}_{\max} - \bar{f}_{\min}}{f_{\max} - f_{\min}}
$$

and

$$
b = \frac{\bar{f}_{\min} f_{\max} - \bar{f}_{\max} f_{\min}}{f_{\max} - f_{\min}}
$$

where $f_{\max}$ and $f_{\min}$ are the maximum and minimum values of the input image.
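
The linear mapping can be sketched in a few lines. This is an illustration only: the function name `normalize_illumination`, the reference values 50 and 200, and the 2 x 2 test image are assumptions made for the example, and the reference minimum and maximum are taken as already computed from the benchmark image.

```python
import numpy as np

def normalize_illumination(f, ref_min, ref_max):
    """Map the input range [f.min(), f.max()] linearly onto [ref_min, ref_max]."""
    f = np.asarray(f, dtype=float)
    f_min, f_max = f.min(), f.max()
    if f_max == f_min:
        raise ValueError("input image is constant; the mapping is undefined")
    a = (ref_max - ref_min) / (f_max - f_min)
    b = (ref_min * f_max - ref_max * f_min) / (f_max - f_min)
    return a * f + b

img = np.array([[20.0, 60.0], [100.0, 180.0]])   # assumed input image
out = normalize_illumination(img, ref_min=50.0, ref_max=200.0)
print(out.min(), out.max())  # the output extremes match the reference values
```

In practice the extremes of the input would be measured over the same moving-object-free subareas used for the benchmark, not over the whole frame.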


Note that the key assumption behind this method is that all images stay within the linear 
operating range of the camera, thus saturation and other nonlinearities are not an issue. 
Another implicit assumption is that moving objects comprise a relatively small area in 
the field of view of the camera, otherwise these objects would overpower the scene and 
the values obtained from fo(x,y) would not make a lot of sense. If the student selects 
another automated approach (e.g., histogram equalization), he/she must discuss the same 
or similar types of assumptions.


Problem 4.1


By direct substitution of f(x) [Eq. (4.2-6)] into F(u) [Eq. (4.2-5)]: 


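$$
\begin{aligned}
F(u) &= \frac{1}{M} \sum_{x=0}^{M-1} \left[ \sum_{r=0}^{M-1} F(r)\, e^{j2\pi rx/M} \right] e^{-j2\pi ux/M} \\
&= \frac{1}{M} \sum_{r=0}^{M-1} F(r) \sum_{x=0}^{M-1} e^{j2\pi rx/M}\, e^{-j2\pi ux/M} \\
&= \frac{1}{M} F(u)\,[M] \\
&= F(u)
\end{aligned}
$$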


where the third step follows from the orthogonality condition given in the problem state- 
ment. Substitution of F(u) into f(x) is handled in a similar manner. 


Problem 4.2 


This is a simple problem to familiarize the student with just the manipulation of the 2-D
Fourier transform and its inverse. The Fourier transform is linear iff:

$$
\Im[a_1 f_1(x,y) + a_2 f_2(x,y)] = a_1 \Im[f_1(x,y)] + a_2 \Im[f_2(x,y)]
$$

where $a_1$ and $a_2$ are arbitrary constants. From the definition of the 2-D transform,

$$
\begin{aligned}
\Im[a_1 f_1(x,y) + a_2 f_2(x,y)] &= \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [a_1 f_1(x,y) + a_2 f_2(x,y)]\, e^{-j2\pi(ux/M + vy/N)} \\
&= \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} a_1 f_1(x,y)\, e^{-j2\pi(ux/M + vy/N)}
 + \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} a_2 f_2(x,y)\, e^{-j2\pi(ux/M + vy/N)} \\
&= a_1 \Im[f_1(x,y)] + a_2 \Im[f_2(x,y)]
\end{aligned}
$$

which proves linearity. The inverse is done in the same way.


Problem 4.3


The inverse DFT of a constant A in the frequency domain is an impulse of strength A in
the spatial domain. Convolving the impulse with the image copies (multiplies) the value
of the impulse at each pixel location in the image.


Problem 4.4


An important aspect of this problem is to recognize that the quantity $(u^2 + v^2)$ can
be replaced by the distance squared, $D^2(u,v)$. This reduces the problem to one variable,
which is notationally easier to manage. Rather than carry an awkward capital letter
throughout the development, we define $w^2 = D^2(u,v) = (u^2 + v^2)$. Then we proceed
as follows:

$$
H(w) = e^{-w^2/2\sigma^2}.
$$

The inverse Fourier transform is

$$
\begin{aligned}
h(z) &= \int_{-\infty}^{\infty} H(w)\, e^{j2\pi wz}\, dw \\
&= \int_{-\infty}^{\infty} e^{-w^2/2\sigma^2}\, e^{j2\pi wz}\, dw \\
&= \int_{-\infty}^{\infty} e^{-\frac{1}{2\sigma^2}\left[w^2 - j4\pi\sigma^2 wz\right]}\, dw.
\end{aligned}
$$

We now make use of the identity

$$
e^{-\frac{(2\pi)^2 z^2 \sigma^2}{2}}\, e^{\frac{(2\pi)^2 z^2 \sigma^2}{2}} = 1.
$$

Inserting this identity in the preceding integral yields

$$
\begin{aligned}
h(z) &= e^{-\frac{(2\pi)^2 z^2 \sigma^2}{2}} \int_{-\infty}^{\infty} e^{-\frac{1}{2\sigma^2}\left[w^2 - j4\pi\sigma^2 wz - (2\pi)^2 \sigma^4 z^2\right]}\, dw \\
&= e^{-\frac{(2\pi)^2 z^2 \sigma^2}{2}} \int_{-\infty}^{\infty} e^{-\frac{1}{2\sigma^2}\left[w - j2\pi\sigma^2 z\right]^2}\, dw.
\end{aligned}
$$

Next we make the change of variable $r = w - j2\pi\sigma^2 z$. Then, $dr = dw$ and the above
integral becomes

$$
h(z) = e^{-\frac{(2\pi)^2 z^2 \sigma^2}{2}} \int_{-\infty}^{\infty} e^{-\frac{r^2}{2\sigma^2}}\, dr.
$$

Finally, we multiply and divide the right side of this equation by $\sqrt{2\pi}\,\sigma$:

$$
h(z) = \sqrt{2\pi}\,\sigma\, e^{-\frac{(2\pi)^2 z^2 \sigma^2}{2}} \left[ \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} e^{-\frac{r^2}{2\sigma^2}}\, dr \right].
$$

The expression inside the brackets is recognized as a Gaussian probability density function,
whose integral from $-\infty$ to $\infty$ is 1. Then,

$$
h(z) = \sqrt{2\pi}\,\sigma\, e^{-2\pi^2 \sigma^2 z^2}.
$$

Going back to two spatial variables gives the final result:
$h(x, y) = \sqrt{2\pi}\,\sigma\, e^{-2\pi^2 \sigma^2 (x^2 + y^2)}$.
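
The one-variable result can be checked numerically. The sketch below compares a Riemann-sum approximation of the inverse-transform integral with the closed form $\sqrt{2\pi}\,\sigma\,e^{-2\pi^2\sigma^2 z^2}$; the values of `sigma` and `z`, and the integration grid, are arbitrary assumptions chosen for the check.

```python
import numpy as np

sigma = 1.5   # assumed standard deviation of the frequency-domain Gaussian
z = 0.2       # assumed spatial location at which h(z) is evaluated

# Riemann-sum approximation of the integral of H(w) exp(j 2 pi w z) dw
w = np.linspace(-40.0, 40.0, 400001)
dw = w[1] - w[0]
integrand = np.exp(-w**2 / (2 * sigma**2)) * np.exp(1j * 2 * np.pi * w * z)
h_numeric = np.sum(integrand) * dw

# Closed form obtained in the derivation
h_closed = np.sqrt(2 * np.pi) * sigma * np.exp(-2 * np.pi**2 * sigma**2 * z**2)

print(h_numeric.real, h_closed)
```

The imaginary part of the numerical integral vanishes up to round-off because the integrand's odd part cancels over the symmetric interval.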


Problem 4.5 


The spatial filter is obtained by taking the inverse Fourier transform of the frequency- 
domain filter: 

$$
\begin{aligned}
h_{hp}(x,y) &= \Im^{-1}[1 - H_{lp}(u,v)] \\
&= \Im^{-1}[1] - \Im^{-1}[H_{lp}(u,v)] \\
&= \delta(0) - \sqrt{2\pi}\,\sigma\, e^{-2\pi^2 \sigma^2 (x^2 + y^2)}
\end{aligned}
$$


Problem 4.6 


(a) We note first that $(-1)^{x+y} = e^{j\pi(x+y)}$. Then,

$$
\begin{aligned}
\Im\left[f(x,y)\, e^{j\pi(x+y)}\right] &= \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[f(x,y)\, e^{j\pi(x+y)}\right] e^{-j2\pi(ux/M + vy/N)} \\
&= \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[f(x,y)\, e^{j2\pi\left(\frac{xM}{2M} + \frac{yN}{2N}\right)}\right] e^{-j2\pi(ux/M + vy/N)} \\
&= \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j2\pi\left(x[u - M/2]/M + y[v - N/2]/N\right)} \\
&= F(u - M/2,\, v - N/2).
\end{aligned}
$$


(b) Following the same format as in (a),

$$
\begin{aligned}
\Im\left[f(x,y)\, e^{j2\pi(u_0 x/M + v_0 y/N)}\right] &= \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} \left[f(x,y)\, e^{j2\pi(u_0 x/M + v_0 y/N)}\right] e^{-j2\pi(ux/M + vy/N)} \\
&= \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j2\pi\left(x[u - u_0]/M + y[v - v_0]/N\right)} \\
&= F(u - u_0,\, v - v_0).
\end{aligned}
$$

Similarly,

$$
\Im^{-1}\left[F(u,v)\, e^{-j2\pi(ux_0/M + vy_0/N)}\right] = f(x - x_0,\, y - y_0).
$$
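
The centering property of part (a) can be verified with a library FFT. This is a minimal sketch, assuming even M and N and a random test image; note that NumPy's `fft2` omits the 1/MN factor used in the text, which does not affect the shift property.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 8, 6                       # assumed (even) image dimensions
f = rng.random((M, N))

x = np.arange(M)[:, None]
y = np.arange(N)[None, :]

# Transform of f(x, y) multiplied by (-1)^(x + y)
F_mod = np.fft.fft2(f * (-1.0) ** (x + y))

# F(u - M/2, v - N/2): circularly shift the transform of f by half a period
F_shift = np.roll(np.fft.fft2(f), shift=(M // 2, N // 2), axis=(0, 1))

centering_error = float(np.max(np.abs(F_mod - F_shift)))
print(centering_error)  # round-off level
```

For even dimensions this shifted transform coincides with what `np.fft.fftshift` returns.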


Problem 4.7 


The equally-spaced, vertical bars on the left, lower third of the image. 


Problem 4.8 


With reference to Eq. (4.4-1), all the highpass filters discussed in Section 4.4 can be
expressed as 1 minus the transfer function of a lowpass filter (which we know does not have
an impulse at the origin). The inverse Fourier transform of 1 gives an impulse at the
origin in the highpass spatial filters.


Problem 4.9 


The complex conjugate simply changes j to —j in the inverse transform, so the image 
on the right is given by 


$$
\begin{aligned}
\Im^{-1}[F^*(u,v)] &= \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u,v)\, e^{-j2\pi(ux/M + vy/N)} \\
&= \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u,v)\, e^{j2\pi(u(-x)/M + v(-y)/N)} \\
&= f(-x, -y)
\end{aligned}
$$

which simply mirrors $f(x,y)$ about the origin, thus producing the image on the right.


Problem 4.10 


If $H(u,v)$ is real and symmetric, then

$$
H(u,v) = H^*(u,v) = H^*(-u,-v) = H(-u,-v).
$$

The filter in the spatial domain is

$$
h(x,y) = \Im^{-1}[H(u,v)] = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} H(u,v)\, e^{j2\pi(ux/M + vy/N)}.
$$

Then,

$$
\begin{aligned}
h^*(x,y) &= \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} H^*(u,v)\, e^{-j2\pi(ux/M + vy/N)} \\
&= \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} H^*(-u,-v)\, e^{j2\pi(ux/M + vy/N)} \\
&= \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} H(u,v)\, e^{j2\pi(ux/M + vy/N)} \\
&= h(x,y) \quad \text{(real)}.
\end{aligned}
$$

Similarly,

$$
\begin{aligned}
h(-x,-y) &= \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} H(u,v)\, e^{-j2\pi(ux/M + vy/N)} \\
&= \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} H(-u,-v)\, e^{j2\pi(ux/M + vy/N)} \\
&= \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} H(u,v)\, e^{j2\pi(ux/M + vy/N)} \\
&= h(x,y) \quad \text{(symmetric)}.
\end{aligned}
$$


Problem 4.11 


Starting from Eq. (4.2-30), we easily find the expression for the definition of continuous
convolution in one dimension:

$$
f(x) * g(x) = \int_{-\infty}^{\infty} f(\alpha)\, g(x - \alpha)\, d\alpha.
$$

The Fourier transform of this expression is

$$
\begin{aligned}
\Im[f(x) * g(x)] &= \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} f(\alpha)\, g(x - \alpha)\, d\alpha \right] e^{-j2\pi ux}\, dx \\
&= \int_{-\infty}^{\infty} f(\alpha) \left[ \int_{-\infty}^{\infty} g(x - \alpha)\, e^{-j2\pi ux}\, dx \right] d\alpha.
\end{aligned}
$$

The term inside the inner brackets is the Fourier transform of $g(x - \alpha)$. But,

$$
\Im[g(x - \alpha)] = G(u)\, e^{-j2\pi u\alpha}
$$

so

$$
\begin{aligned}
\Im[f(x) * g(x)] &= \int_{-\infty}^{\infty} f(\alpha) \left[ G(u)\, e^{-j2\pi u\alpha} \right] d\alpha \\
&= G(u) \int_{-\infty}^{\infty} f(\alpha)\, e^{-j2\pi u\alpha}\, d\alpha \\
&= G(u)F(u).
\end{aligned}
$$

This proves that multiplication in the frequency domain is equal to convolution in the
spatial domain. The proof that multiplication in the spatial domain is equal to convolution
in the frequency domain is done in a similar way.
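
A discrete analogue of this result is easy to check numerically. The following is a sketch, assuming periodic (circular) convolution of two random length-16 sequences; `circular_convolve` is a hypothetical helper written for the example. With NumPy's unnormalized DFT the transform of the circular convolution equals the product of the transforms exactly; under the 1/M convention of the text the two sides differ by a constant factor.

```python
import numpy as np

def circular_convolve(f, g):
    # Direct evaluation of the periodic convolution sum over alpha
    M = len(f)
    return np.array([sum(f[a] * g[(x - a) % M] for a in range(M))
                     for x in range(M)])

rng = np.random.default_rng(2)
f = rng.random(16)
g = rng.random(16)

lhs = np.fft.fft(circular_convolve(f, g))   # transform of the convolution
rhs = np.fft.fft(f) * np.fft.fft(g)         # product of the transforms

conv_error = float(np.max(np.abs(lhs - rhs)))
print(conv_error)  # round-off level
```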


Problem 4.12 


(a) The ring in fact has a dark center area as a result of the highpass operation only (the
following image shows the result of highpass filtering only). However, the dark center
area is averaged out by the lowpass filter. The reason the final result looks so bright is
that the discontinuities (edges) on the boundaries of the ring are much higher than anywhere
else in the image, thus giving an averaged area whose gray level dominates.

(b) Filtering with the Fourier transform is a linear process. The order does not matter. 



Figure P4.12 


Problem 4.13 


(a) One application of the filter gives:

$$
G(u,v) = H(u,v)F(u,v) = e^{-D^2(u,v)/2D_0^2}\, F(u,v).
$$

Similarly, K applications of the filter would give

$$
G_K(u,v) = e^{-KD^2(u,v)/2D_0^2}\, F(u,v).
$$

The inverse DFT of $G_K(u,v)$ would give the image resulting from K passes of the
Gaussian filter. If K is "large enough," the Gaussian LPF will become a notch pass
filter, passing only F(0, 0). We know that this term is equal to the average value of the
image. So, there is a value of K after which the result of repeated lowpass filtering
will simply produce a constant image. The value of all pixels on this image will be
equal to the average value of the original image. Note that the answer applies even as
K approaches infinity. In this case the filter will approach an impulse at the origin, and
this would still give us F(0, 0) as the result of filtering.


(b) To guarantee the result in (a), K has to be chosen large enough so that the filter
becomes a notch pass filter (at the origin) for all values of D(u,v). Keeping in mind
that increments of frequencies are in unit values, this means

$$
H_K(u,v) = e^{-KD^2(u,v)/2D_0^2} =
\begin{cases}
1 & \text{if } (u,v) = (0,0) \\
0 & \text{otherwise.}
\end{cases}
$$

Because u and v are integers, the conditions on the second line in this equation are
satisfied for all $u \geq 1$ and/or $v \geq 1$. When u = v = 0, D(u,v) = 0, and $H_K(u,v) = 1$,
as desired.


We want all values of the filter to be zero for all values of the distance from the origin
that are greater than 0 (i.e., for values of u and/or v greater than 0). However, the filter is
a Gaussian function, so its value is always greater than 0 for all finite values of D(u,v).
But, we are dealing with digital numbers, which will be designated as zero whenever
the value of the filter is less than one-half the smallest positive number representable in the
computer being used. Assume this number to be $k_{\min}$ (don't confuse the meaning of this
k with K, which is the number of applications of the filter). So, values of K for which
the filter function is less than $0.5 \times k_{\min}$ will suffice. That is, we want the
minimum value of K for which

$$
e^{-KD^2(u,v)/2D_0^2} < 0.5\,k_{\min}
$$

or

$$
K > -\frac{\ln(0.5\,k_{\min})}{D^2(u,v)/2D_0^2} = -\frac{2D_0^2 \ln(0.5\,k_{\min})}{D^2(u,v)}.
$$

As noted above, we want this equation to hold for all values of $D^2(u,v) > 0$. Since the
exponential decreases as a function of increasing distance from the origin, we choose
the smallest possible value of $D^2(u,v)$, which is 1. This gives the result

$$
K > -2D_0^2 \ln(0.5\,k_{\min})
$$

which gives a positive number because $k_{\min} \ll 1$. This result guarantees that the
lowpass filter will act as a notch pass filter, leaving only the value of the transform at the
origin. The image will not change past this value of K.
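
As a rough illustration of the size of this bound, the sketch below evaluates it for double-precision arithmetic. Two assumptions are made here and are not part of the solution: the smallest representable positive number is taken to be the smallest normalized double (`sys.float_info.min`), and the cutoff is set to 10; `min_passes` is a hypothetical helper name.

```python
import math
import sys

def min_passes(d0, k_min=sys.float_info.min):
    """Lower bound on K from K > -2 * D0^2 * ln(0.5 * k_min)."""
    return -2.0 * d0**2 * math.log(0.5 * k_min)

K = min_passes(d0=10.0)   # assumed cutoff D0 = 10
print(K)
```

Because the bound grows with the square of the cutoff, a wide Gaussian needs many more passes than a narrow one before the result stops changing.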


Problem 4.14 


(a) The spatial average is

$$
g(x,y) = \frac{1}{4}\left[f(x,y+1) + f(x+1,y) + f(x-1,y) + f(x,y-1)\right].
$$

From Eq. (4.6-2),

$$
\begin{aligned}
G(u,v) &= \frac{1}{4}\left[e^{j2\pi v/N} + e^{j2\pi u/M} + e^{-j2\pi u/M} + e^{-j2\pi v/N}\right] F(u,v) \\
&= H(u,v)F(u,v),
\end{aligned}
$$

where

$$
H(u,v) = \frac{1}{2}\left[\cos(2\pi u/M) + \cos(2\pi v/N)\right]
$$

is the filter transfer function in the frequency domain.


(b) To see that this is a lowpass filter, it helps to express the preceding equation in the
form of our familiar centered functions:

$$
H(u,v) = \frac{1}{2}\left[\cos(2\pi[u - M/2]/M) + \cos(2\pi[v - N/2]/N)\right].
$$

Consider one variable for convenience. As u ranges from 0 to M, the value of
$\cos(2\pi[u - M/2]/M)$ starts at $-1$, peaks at 1 when u = M/2 (the center of the filter) and then
decreases to $-1$ again when u = M. Thus, we see that the amplitude of the filter decreases
as a function of distance from the origin of the centered filter, which is the characteristic
of a lowpass filter. A similar argument is easily carried out when considering both
variables simultaneously.
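
The transfer function in (a) can be cross-checked against the DFT of the averaging mask itself. This is a sketch under assumptions: an 8 x 8 grid, periodic indexing of the mask, and NumPy's unnormalized `fft2`.

```python
import numpy as np

M, N = 8, 8   # assumed grid size

# Spatial mask of the four-neighbor average, with wrap-around indexing
h = np.zeros((M, N))
h[1, 0] = h[-1, 0] = h[0, 1] = h[0, -1] = 0.25

u = np.arange(M)[:, None]
v = np.arange(N)[None, :]
H_formula = 0.5 * (np.cos(2 * np.pi * u / M) + np.cos(2 * np.pi * v / N))

H_fft = np.fft.fft2(h)
transfer_error = float(np.max(np.abs(H_fft - H_formula)))
print(transfer_error)                              # round-off level
print(H_formula[0, 0], H_formula[M // 2, N // 2])  # values at lowest and highest frequency
```

In this uncentered form the filter equals 1 at zero frequency and reaches its most negative value at (M/2, N/2), which is the lowpass behavior described in (b) before centering.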


Problem 4.15 


The problem statement gives the form of the difference in the x-direction. A similar
expression gives the difference in the y-direction. The filtered function in the spatial
domain then is:

$$
g(x,y) = f(x,y) - f(x+1,y) + f(x,y) - f(x,y+1).
$$

From Eq. (4.6-2),

$$
\begin{aligned}
G(u,v) &= F(u,v) - F(u,v)\,e^{j2\pi u/M} + F(u,v) - F(u,v)\,e^{j2\pi v/N} \\
&= \left[1 - e^{j2\pi u/M}\right]F(u,v) + \left[1 - e^{j2\pi v/N}\right]F(u,v) \\
&= H(u,v)F(u,v),
\end{aligned}
$$

where H(u,v) is the filter function:

$$
H(u,v) = -2j\left[\sin(\pi u/M)\,e^{j\pi u/M} + \sin(\pi v/N)\,e^{j\pi v/N}\right].
$$

(b) To see that this is a highpass filter, it helps to express the filter function in the form
of our familiar centered functions:

$$
H(u,v) = -2j\left[\sin(\pi[u - M/2]/M)\,e^{j\pi u/M} + \sin(\pi[v - N/2]/N)\,e^{j\pi v/N}\right].
$$

Consider one variable for convenience. As u ranges from 0 to M, H(u,v) starts at
its maximum (complex) value of 2j for u = 0 and decreases from there. When u =
M/2 (the center of the shifted function), the sine term is zero, so the filter is 0 for that
variable. The value of H(u,v) then starts increasing again
and achieves the maximum value of 2j again when u = M. Thus, this filter has a
value of 0 at the origin and increases with increasing distance from the origin. This
is the characteristic of a highpass filter. A similar argument is easily carried out when
considering both variables simultaneously.


Problem 4.16 

(a) The key for the student to be able to solve the problem is to treat the number of
applications (denoted by K) of the highpass filter as 1 minus K applications of the
corresponding lowpass filter, so that

$$
\begin{aligned}
G_K(u,v) &= H_K(u,v)F(u,v) \\
&= \left[1 - e^{-KD^2(u,v)/2D_0^2}\right] F(u,v)
\end{aligned}
$$

where the Gaussian lowpass filter is from Problem 4.13. Students who start directly
with the expression of the Gaussian highpass filter $1 - e^{-KD^2(u,v)/2D_0^2}$ and attempt
to raise it to the Kth power will run into a dead end.

The solution to this problem parallels the solution to Problem 4.13. Here, however,
the filter will approach a notch filter that will take out F(0, 0) and thus will produce an
image with zero average value (this implies negative pixels). So, there is a value of
K after which further highpass filtering no longer changes the result: the output is the
original image with its average value removed.

(b) The problem is to determine the value of K for which

$$
H_K(u,v) = 1 - e^{-KD^2(u,v)/2D_0^2} =
\begin{cases}
0 & \text{if } (u,v) = (0,0) \\
1 & \text{otherwise.}
\end{cases}
$$

Because u and v are integers, the conditions on the second line in this equation are
satisfied for all $u \geq 1$ and/or $v \geq 1$. When u = v = 0, D(u,v) = 0, and $H_K(u,v) = 0$,
as desired.


We want all values of the filter to be 1 for all values of the distance from the origin that
are greater than 0 (i.e., for values of u and/or v greater than 0). For $H_K(u,v)$ to become
1, the exponential term has to become 0 for values of u and/or v greater than 0. This is
the same requirement as in Problem 4.13, so the solution of that problem applies here as
well.


Problem 4.17 


(a) Express filtering as convolution to reduce all processes to the spatial domain. Then,
the filtered image is given by

$$
g(x,y) = h(x,y) * f(x,y)
$$

where h is the spatial filter (inverse Fourier transform of the frequency-domain filter)
and f is the input image. Histogram processing this result yields

$$
\begin{aligned}
g'(x,y) &= T[g(x,y)] \\
&= T[h(x,y) * f(x,y)],
\end{aligned}
$$

where T denotes the histogram equalization transformation. If we histogram-equalize
first, then

$$
g(x,y) = T[f(x,y)]
$$

and

$$
g'(x,y) = h(x,y) * T[f(x,y)].
$$

In general, T is a nonlinear function determined by the nature of the pixels in the image
from which it is computed. Thus, in general, $T[h(x,y) * f(x,y)] \neq h(x,y) * T[f(x,y)]$
and the order does matter.

(b) As indicated in Section 4.4, highpass filtering severely diminishes the contrast of 
an image. Although high-frequency emphasis helps some, the improvement is usually 
not dramatic (see Fig. 4.30). Thus, if an image is histogram equalized first, the gain 
in contrast improvement will essentially be lost in the filtering process. Therefore, the 
procedure in general is to filter first and histogram-equalize the image after that.


Problem 4.18


The answer is no. The Fourier transform is a linear process, while the square and square
roots involved in computing the gradient are nonlinear operations. The Fourier transform
could be used to compute the derivatives (as differences; see Prob. 4.15), but the
squares, square root, or absolute values must be computed directly in the spatial domain.


Problem 4.19


The equation corresponding to the mask in Fig. 4.27(f) is Eq. (3.7-4):

$$
g(x,y) = [f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1)] - 4f(x,y).
$$

As in Problem 4.15,

$$
G(u,v) = H(u,v)F(u,v)
$$

where

$$
\begin{aligned}
H(u,v) &= \left[e^{j2\pi u/M} + e^{-j2\pi u/M} + e^{j2\pi v/N} + e^{-j2\pi v/N} - 4\right] \\
&= 2\left[\cos(2\pi u/M) + \cos(2\pi v/N) - 2\right].
\end{aligned}
$$

Shifting the filter to the center of the frequency rectangle gives

$$
H(u,v) = 2\left[\cos(2\pi[u - M/2]/M) + \cos(2\pi[v - N/2]/N) - 2\right].
$$

When (u,v) = (M/2, N/2) (the center of the shifted filter), H(u,v) = 0. For values away from
the center, values of H(u,v) decrease, but this is as expected [see Fig. 4.27(a)] for this
particular formulation of the Laplacian.


Problem 4.20


From Eq. (4.4-3), the transfer function of a Butterworth highpass filter is

$$
H(u,v) = \frac{1}{1 + \left[\dfrac{D_0}{D(u,v)}\right]^{2n}}.
$$

We want the filter to have a value of $\gamma_L$ when D(u,v) = 0, and approach $\gamma_H$ for high
values of D(u,v). The preceding equation is easily modified to accomplish this:

$$
H(u,v) = \gamma_L + \frac{(\gamma_H - \gamma_L)}{1 + \left[\dfrac{D_0}{D(u,v)}\right]^{2n}}.
$$

The value of n controls the sharpness of the transition between $\gamma_L$ and $\gamma_H$.


Problem 4.21


Recall that the reason for padding is to establish a "buffer" between the periods that are
implicit in the DFT. Imagine the image on the left being duplicated infinitely many times
to cover the xy-plane. The result would be a checkerboard, with each square in
the checkerboard being the image (and the black extensions). Now imagine doing the
same thing to the image on the right. The results would be indistinguishable. Thus,
either form of padding accomplishes the same separation between images, as desired.


Problem 4.22 

(a) Padding an image with zeros increases its size, but not its gray-level content. Thus,
the average gray level of the padded image is lower than that of the original image.
This implies that F(0, 0) in the spectrum of the padded image is less than F(0, 0) in
the original image (recall that F(0, 0) is the average value of the corresponding image).
Thus, we can visualize F(0, 0) being lower in the spectrum on the right, with all values
away from the origin being lower too, and covering a narrower range of values. That's
the reason the overall contrast is lower in the picture on the right.

(b) Padding an image with 0’s introduces significant discontinuities at the borders of the 
original images. This process introduces strong horizontal and vertical edges, where 
the image ends abruptly and then continues with 0 values. These sharp transitions 
correspond to the strength of the spectrum along the horizontal and vertical axes of the 
spectrum. 

Problem 4.23 

As in problem 4.9, taking the complex conjugate of an image mirrors it in the spatial 
domain. Thus, we would expect the result to be a mirror image (about both axes) of 
Fig. 4.41(e). 

Problem 4.24 


(a) and (b) See Figs. P4.24(a) and (b). (c) and (d) See Figs. P4.24(c) and (d).





Problem 4.25 


Because $M = 2^n$, we can write Eqs. (4.6-47) and (4.6-48) respectively as

$$
m(n) = \frac{1}{2}Mn
$$

and

$$
a(n) = Mn.
$$

Proof by induction begins by showing that both equations hold for n = 1:

$$
m(1) = \frac{1}{2}(2)(1) = 1 \quad \text{and} \quad a(1) = (2)(1) = 2.
$$

We know these results to be correct from the discussion in Section 4.6.6. Next, we
assume that the equations hold for n. Then, we are required to prove that they also are
true for n + 1. From Eq. (4.6-45),

$$
m(n+1) = 2m(n) + 2^n.
$$

Substituting m(n) from above,

$$
\begin{aligned}
m(n+1) &= 2\left(\frac{1}{2}Mn\right) + 2^n \\
&= 2\left(\frac{1}{2}2^n n\right) + 2^n \\
&= 2^n(n+1) \\
&= \frac{1}{2}\left(2^{n+1}\right)(n+1).
\end{aligned}
$$

Therefore, Eq. (4.6-47) is valid for all n.

From Eq. (4.6-46),

$$
a(n+1) = 2a(n) + 2^{n+1}.
$$

Substituting the above expression for a(n) yields

$$
\begin{aligned}
a(n+1) &= 2Mn + 2^{n+1} \\
&= 2(2^n n) + 2^{n+1} \\
&= 2^{n+1}(n+1)
\end{aligned}
$$

which completes the proof.
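
The induction result can also be spot-checked by iterating the two recursions directly. This is a small sketch; the function names `m_rec` and `a_rec` are illustrative, and the recursions are those of Eqs. (4.6-45) and (4.6-46) as quoted above, rewritten for index n instead of n + 1.

```python
def m_rec(n):
    # m(1) = 1 and m(n) = 2 m(n - 1) + 2^(n - 1)
    return 1 if n == 1 else 2 * m_rec(n - 1) + 2 ** (n - 1)

def a_rec(n):
    # a(1) = 2 and a(n) = 2 a(n - 1) + 2^n
    return 2 if n == 1 else 2 * a_rec(n - 1) + 2 ** n

for n in range(1, 11):
    M = 2 ** n
    # Closed forms: m(n) = (1/2) M n and a(n) = M n
    print(n, m_rec(n), M * n // 2, a_rec(n), M * n)
```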

Problem 4.26 

Consider a single star modeled as an impulse $\delta(x - x_0, y - y_0)$. Then,

$$
f(x,y) = K\delta(x - x_0, y - y_0)
$$

from which

$$
\begin{aligned}
z(x,y) = \ln f(x,y) &= \ln K + \ln \delta(x - x_0, y - y_0) \\
&= K' + \delta'(x - x_0, y - y_0).
\end{aligned}
$$

Taking the Fourier transform of both sides yields

$$
\begin{aligned}
\Im[z(x,y)] &= \Im[K'] + \Im[\delta'(x - x_0, y - y_0)] \\
&= \delta(0,0) + e^{-j2\pi(ux_0 + vy_0)}.
\end{aligned}
$$

From this result, it is evident that the contribution of illumination is an impulse at the 
origin of the frequency plane. A notch filter that attenuates only this component will 
take care of the problem. Extension of this development to multiple impulses (stars) is 
straightforward. The filter will be the same. 

Problem 4.27 

The problem can be solved by carrying out the following steps:

1. Perform a median filtering operation.

2. Follow (1) by high-frequency emphasis.

3. Histogram-equalize this result.

4. Compute the average gray level, $K_0$. Add the quantity $(K - K_0)$ to all pixels.

5. Perform the transformations shown in Fig. P4.27, where r is the input gray level,
and R, G, and B are fed into an RGB color monitor.


Figure P4.27 (horizontal axis: Input; vertical axis: Output)


Problem 5.1


The solutions to (a), (b), and (c) are shown in Fig. P5.1, from left to right: 



Figure P5.1 


Problem 5.2 


The solutions to (a), (b), and (c) are shown in Fig. P5.2, from left to right: 



Figure P5.2


Problem 5.3


The solutions to (a), (b), and (c) are shown in Fig. P5.3, from left to right: 



Figure P5.3 


Problem 5.4 

The solutions to (a), (b), and (c) are shown in Fig. P5.4, from left to right: 



Figure P5.4 


Problem 5.5 


The solutions to (a), (b), and (c) are shown in Fig. P5.5, from left to right:



Figure P5.5


Problem 5.6 


The solutions to (a), (b), and (c) are shown in Fig. P5.6, from left to right: 


Figure P5.6 


Problem 5.7 


The solutions to (a), (b), and (c) are shown in Fig. P5.7, from left to right: 



Figure P5.7


Problem 5.8

The solutions to (a), (b), and (c) are shown in Fig. P5.8, from left to right: 



Figure P5.8 


Problem 5.9 

The solutions to (a), (b), and (c) are shown in Fig. P5.9, from left to right: 



Figure P5.9 


Problem 5.10 


(a) The key to this problem is that the geometric mean is zero whenever any pixel is 
zero. Draw a profile of an ideal edge with a few points valued 0 and a few points valued 
1. The geometric mean will give only values of 0 and 1, whereas the arithmetic mean 
will give intermediate values (blur). 

(b) Black is 0, so the geometric mean will return values of 0 as long as at least one pixel
in the window is black. Since the center of the mask can be outside the original black
area when this happens, the figure will be thickened.


Problem 5.11 


The key to understanding the behavior of the contra-harmonic filter is to think of the pix- 
els in the neighborhood surrounding a noise impulse as being constant, with the impulse 
noise point being in the center of the neighborhood. For the noise spike to be visible, 
its value must be considerably larger than the value of its neighbors. Also keep in mind 
that the power in the numerator is 1 plus the power in the denominator. 

(a) By definition, pepper noise is a low value (really 0). It is most visible when
surrounded by light values. The center pixel (the pepper noise) will have little influence
in the sums. If the area spanned by the filter is approximately constant, the ratio will
approach the value of the pixels in the neighborhood, thus reducing the effect of the
low-value pixel. For example, here are some values of the filter for a dark point of value
1 in a 3 x 3 region with pixels of value 100: for Q = 0.5, filter = 98.78; for Q = 1,
filter = 99.88; for Q = 2, filter = 99.99; and for Q = 5, filter = 100.00.

(b) The reverse happens when the center point is large and its neighbors are small. The
center pixel will now be the largest. However, the exponent is now negative, so the small
numbers will dominate the result. The numerator can then be thought of as a constant raised
to the power Q + 1 and the denominator as the same constant raised to the power Q.
That constant is the value of the pixels in the neighborhood. So the ratio is just that
value.

(c) When the wrong polarity is used, the large numbers in the case of the salt noise will
be raised to a positive power, thus the noise will overpower the result. For salt noise
the image will become very light. The opposite is true for pepper noise: the image will
become dark.

(d) When Q = -1, the value of the numerator becomes equal to the number of pixels in
the neighborhood (m x n). The denominator becomes a sum of values, each of
which is 1 over the value of a pixel in the neighborhood. The ratio is therefore the
reciprocal of the average of these 1-over-pixel values, that is, the harmonic mean of
the pixels in the neighborhood.

(e) In a constant area, the filter returns the value of the pixels in the area, independently 
of the value of Q. 
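
The example values quoted in part (a) can be reproduced with a direct evaluation of the filter over a single window. This is a minimal sketch: `contraharmonic` is a hypothetical helper that computes the ratio of the sum of pixel values raised to Q + 1 to the sum raised to Q, and the 3 x 3 window is the one described in (a).

```python
import numpy as np

def contraharmonic(window, q):
    """Contraharmonic mean of one neighborhood: sum(g^(Q+1)) / sum(g^Q)."""
    w = np.asarray(window, dtype=float)
    return float(np.sum(w ** (q + 1)) / np.sum(w ** q))

# 3 x 3 region of value 100 with a dark (pepper) point of value 1 at the center
window = np.full((3, 3), 100.0)
window[1, 1] = 1.0

for q in (0.5, 1, 2, 5):
    print(q, contraharmonic(window, q))

# Part (e): a constant area returns the pixel value for any Q
print(contraharmonic(np.full((3, 3), 50.0), -1.5))
```

Pixel values of exactly 0 would need special handling for negative Q, since the negative powers are undefined there; the sketch avoids that case.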




Problem 5.12 


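A bandpass filter is obtained by subtracting the corresponding bandreject filter from 1:

$$
H_{bp}(u,v) = 1 - H_{br}(u,v).
$$

Then:

(a) Ideal bandpass filter:

$$
H_{Ibp}(u,v) =
\begin{cases}
0 & \text{if } D(u,v) < D_0 - \frac{W}{2} \\
1 & \text{if } D_0 - \frac{W}{2} \leq D(u,v) \leq D_0 + \frac{W}{2} \\
0 & \text{if } D(u,v) > D_0 + \frac{W}{2}.
\end{cases}
$$

(b) Butterworth bandpass filter:

$$
\begin{aligned}
H_{Bbp}(u,v) &= 1 - \frac{1}{1 + \left[\dfrac{D(u,v)W}{D^2(u,v) - D_0^2}\right]^{2n}} \\
&= \frac{\left[\dfrac{D(u,v)W}{D^2(u,v) - D_0^2}\right]^{2n}}{1 + \left[\dfrac{D(u,v)W}{D^2(u,v) - D_0^2}\right]^{2n}}.
\end{aligned}
$$

(c) Gaussian bandpass filter:

$$
\begin{aligned}
H_{Gbp}(u,v) &= 1 - \left[1 - e^{-\frac{1}{2}\left[\frac{D^2(u,v) - D_0^2}{D(u,v)W}\right]^2}\right] \\
&= e^{-\frac{1}{2}\left[\frac{D^2(u,v) - D_0^2}{D(u,v)W}\right]^2}.
\end{aligned}
$$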
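
The Butterworth and Gaussian bandpass expressions can be evaluated as functions of the distance D. The sketch below is an illustration under stated assumptions: the function names, the band center of 50, the width of 10, and an integer order n are choices made for the example, and the Butterworth filter is computed in its "1 minus bandreject" form so that the value at D = D_0 comes out as 1.

```python
import numpy as np

def butterworth_bandpass(D, D0, W, n):
    # 1 minus the Butterworth bandreject filter; n is assumed to be an integer
    D = np.asarray(D, dtype=float)
    with np.errstate(divide="ignore"):
        ratio = (D * W / (D**2 - D0**2)) ** (2 * n)
    return 1.0 - 1.0 / (1.0 + ratio)

def gaussian_bandpass(D, D0, W):
    # 1 minus the Gaussian bandreject filter
    D = np.asarray(D, dtype=float)
    with np.errstate(divide="ignore"):
        arg = (D**2 - D0**2) / (D * W)
    return np.exp(-0.5 * arg**2)

D = np.array([0.0, 50.0, 1000.0])      # distances from the origin (assumed)
H_b = butterworth_bandpass(D, D0=50.0, W=10.0, n=2)
H_g = gaussian_bandpass(D, D0=50.0, W=10.0)
print(H_b)
print(H_g)
```

Both filters are 0 at the origin, 1 at the band center, and fall toward 0 far from the band, as expected of a bandpass response.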


Problem 5.13 


A notch pass filter is obtained by subtracting the corresponding notch reject filter from 
1: 


Then: 


-^np (tb tt) 1 v') • 


(a) Ideal notch pass filter: 


-^Inp (tr, v j 


1 if D\ ( u , v ) < D 0 or D 2 ( u , v ) < D 0 

0 otherwise 
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Problem 5.14 


(b) Butterworth notch pass filter: 

-^Bnp(tT) m) 

(c) Gaussian notch pass filter: 

^Gnp(tT, V j 


1 1 

D 2 

n 


D i (u,v)D 2 (u,v) 




L 

1 

n 



\_Di (u,v)D 2 ( u,v ) J 



i + 

r pi 


n 

\_Di (u,v)D 2 ( u,v ) J 




■ 

i \ Pi : 

U,V)D?(U, 


l - 

e”n 

D 2 

i 

'c,(« 



2 


"55 J 



We proceed as follows:

$$F(u,v) = \int\!\!\int_{-\infty}^{\infty} f(x,y)\,e^{-j2\pi(ux+vy)}\,dx\,dy = \int\!\!\int_{-\infty}^{\infty} A\sin(u_0x+v_0y)\,e^{-j2\pi(ux+vy)}\,dx\,dy.$$

Using the exponential definition of the sine function:

$$\sin\theta = \frac{1}{2j}\left(e^{j\theta} - e^{-j\theta}\right)$$

gives us

$$\begin{aligned}
F(u,v) &= \frac{-jA}{2}\int\!\!\int_{-\infty}^{\infty}\left[e^{j(u_0x+v_0y)} - e^{-j(u_0x+v_0y)}\right]e^{-j2\pi(ux+vy)}\,dx\,dy \\
&= \frac{-jA}{2}\int\!\!\int_{-\infty}^{\infty} e^{j2\pi(u_0x/2\pi + v_0y/2\pi)}\,e^{-j2\pi(ux+vy)}\,dx\,dy
 + \frac{jA}{2}\int\!\!\int_{-\infty}^{\infty} e^{-j2\pi(u_0x/2\pi + v_0y/2\pi)}\,e^{-j2\pi(ux+vy)}\,dx\,dy.
\end{aligned}$$

These are the Fourier transforms of the functions

$$1 \times e^{j2\pi(u_0x/2\pi + v_0y/2\pi)}$$

and

$$1 \times e^{-j2\pi(u_0x/2\pi + v_0y/2\pi)}$$

respectively. The Fourier transform of the 1 gives an impulse at the origin, and the
exponentials shift the origin of the impulse, as discussed in Section 4.6.1. Thus,

$$F(u,v) = \frac{-jA}{2}\left[\delta\!\left(u - \frac{u_0}{2\pi},\, v - \frac{v_0}{2\pi}\right) - \delta\!\left(u + \frac{u_0}{2\pi},\, v + \frac{v_0}{2\pi}\right)\right].$$
Problem 5.15


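From Eq. (5.4-19),

$$\sigma^2 = \frac{1}{(2a+1)(2b+1)}\sum\sum\left\{\left[g(\gamma) - w\eta(\gamma)\right] - \left[\bar{g} - w\bar{\eta}\right]\right\}^2$$

where "$\gamma$" indicates terms affected by the summations. Letting K = 1/(2a+1)(2b+1),
taking the partial derivative of $\sigma^2$ with respect to w and setting the result equal to zero
gives

$$\begin{aligned}
\frac{\partial\sigma^2}{\partial w} &= K\sum\sum 2\left[g(\gamma) - w\eta(\gamma) - \bar{g} + w\bar{\eta}\right]\left[-\eta(\gamma) + \bar{\eta}\right] = 0 \\
&= K\sum\sum\left[-g(\gamma)\eta(\gamma) + g(\gamma)\bar{\eta} + w\eta^2(\gamma) - w\eta(\gamma)\bar{\eta} + \bar{g}\eta(\gamma) - \bar{g}\bar{\eta} - w\bar{\eta}\eta(\gamma) + w\bar{\eta}^2\right] = 0 \\
&= -\overline{g\eta} + \bar{g}\bar{\eta} + w\overline{\eta^2} - w\bar{\eta}^2 + \bar{g}\bar{\eta} - \bar{g}\bar{\eta} - w\bar{\eta}^2 + w\bar{\eta}^2 = 0 \\
&= -\overline{g\eta} + \bar{g}\bar{\eta} + w\left(\overline{\eta^2} - \bar{\eta}^2\right) = 0
\end{aligned}$$

where, for example, we used the fact that

$$\frac{1}{(2a+1)(2b+1)}\sum\sum g(\gamma)\eta(\gamma) = \overline{g\eta}.$$

Solving for w gives us

$$w = \frac{\overline{g\eta} - \bar{g}\bar{\eta}}{\overline{\eta^2} - \bar{\eta}^2}.$$

Finally, inserting the variables x and y,

$$w(x,y) = \frac{\overline{g(x,y)\eta(x,y)} - \bar{g}(x,y)\bar{\eta}(x,y)}{\overline{\eta^2}(x,y) - \bar{\eta}^2(x,y)}$$

which agrees with Eq. (5.4-21).


Problem 5.16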


From Eq. (5.5-13),

$$g(x,y) = \int\!\!\int_{-\infty}^{\infty} f(\alpha,\beta)\,h(x-\alpha,\,y-\beta)\,d\alpha\,d\beta.$$

It is given that $f(x,y) = \delta(x-a)$, so $f(\alpha,\beta) = \delta(\alpha-a)$. Then, using the impulse
response given in the problem statement,

$$\begin{aligned}
g(x,y) &= \int\!\!\int_{-\infty}^{\infty}\delta(\alpha-a)\,e^{-\left[(x-\alpha)^2+(y-\beta)^2\right]}\,d\alpha\,d\beta \\
&= \int\!\!\int_{-\infty}^{\infty}\delta(\alpha-a)\,e^{-(x-\alpha)^2}\,e^{-(y-\beta)^2}\,d\alpha\,d\beta \\
&= \int_{-\infty}^{\infty}\delta(\alpha-a)\,e^{-(x-\alpha)^2}\,d\alpha\int_{-\infty}^{\infty}e^{-(y-\beta)^2}\,d\beta \\
&= e^{-(x-a)^2}\int_{-\infty}^{\infty}e^{-(y-\beta)^2}\,d\beta
\end{aligned}$$

where we used the fact that the integral of the impulse is nonzero only when $\alpha = a$.
Next, we note that

$$\int_{-\infty}^{\infty}e^{-(y-\beta)^2}\,d\beta = \int_{-\infty}^{\infty}e^{-(\beta-y)^2}\,d\beta$$

which is in the form of a constant times a Gaussian density with variance $\sigma^2 = 1/2$ or
standard deviation $\sigma = 1/\sqrt{2}$. In other words,

$$e^{-(\beta-y)^2} = \sqrt{2\pi(1/2)}\left[\frac{1}{\sqrt{2\pi(1/2)}}\,e^{-\frac{1}{2}\frac{(\beta-y)^2}{(1/2)}}\right].$$

The integral from minus to plus infinity of the quantity inside the brackets is 1, so

$$g(x,y) = \sqrt{\pi}\,e^{-(x-a)^2}$$

which is a blurred version of the original image.
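
The closed form above can be checked by evaluating the remaining integral numerically. This is a minimal sketch under stated assumptions: the function name, the truncation of the infinite range to [-10, 10], and the trapezoidal rule are illustrative choices, not part of the original solution.

```python
import math

def blurred_line(x, a, half_width=10.0, steps=20000):
    """Approximate g(x, y) for f = delta(x - a) and h = exp(-(x^2 + y^2))."""
    # The alpha integral collapses to exp(-(x - a)^2) by the sifting property.
    # The beta integral does not depend on y, so it is evaluated about beta = 0.
    d_beta = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        beta = -half_width + k * d_beta
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoidal end weights
        total += weight * math.exp(-beta * beta)
    return math.exp(-(x - a) ** 2) * total * d_beta
```

At x = a the value should agree with the square root of pi to within the accuracy of the numerical integration.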


Problem 5.17 


Because the motions in the x- and y-directions are independent (motion is in the vertical
(x) direction only at first, and then switches to motion only in the horizontal (y)
direction), this problem can be solved in two steps. The first step is identical to the
analysis that resulted in Eq. (5.6-10), which gives the blurring function due to vertical
motion only:

$$H_1(u,v) = \frac{T_1}{\pi ua}\sin(\pi ua)\,e^{-j\pi ua},$$

where we are representing linear motion by the equation $x_0(t) = at/T_1$. The function
$H_1(u,v)$ would give us a blurred image in the vertical direction. That blurred image is
the image that would then start moving in the horizontal direction and to which
horizontal blurring would be applied. This is nothing more than applying a second filter
with transfer function

$$H_2(u,v) = \frac{T_2}{\pi vb}\sin(\pi vb)\,e^{-j\pi vb}$$

where we assumed the form $y_0(t) = bt/T_2$ for motion in the y-direction. Therefore, the
overall blurring transfer function is given by the product of these two functions:

$$H(u,v) = \frac{T_1T_2}{(\pi ua)(\pi vb)}\sin(\pi ua)\sin(\pi vb)\,e^{-j\pi(ua+vb)},$$

and the overall blurred image is

$$g(x,y) = \mathfrak{F}^{-1}\left[H(u,v)F(u,v)\right]$$

where F(u, v) is the Fourier transform of the input image.
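
The product form of the overall transfer function can be illustrated with a short sketch. The function names are hypothetical, and the value returned at zero frequency is the limit of sin(x)/x, which is an assumption added here to avoid a division by zero.

```python
import cmath
import math

def linear_motion_tf(freq, distance, duration):
    """Transfer function for uniform motion over `distance` during `duration`."""
    arg = math.pi * freq * distance
    if arg == 0.0:
        return complex(duration)  # limit of sin(arg)/arg is 1
    return duration * math.sin(arg) / arg * cmath.exp(-1j * arg)

def sequential_blur_tf(u, v, a, b, t1, t2):
    """Vertical motion (a, t1) followed by horizontal motion (b, t2)."""
    return linear_motion_tf(u, a, t1) * linear_motion_tf(v, b, t2)
```

Evaluating `sequential_blur_tf` at u = 1/a shows one of the zeros of the vertical blur, which is where inverse filtering would become unstable.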


Problem 5.18 


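Following the procedure in Section 5.6.3,

$$\begin{aligned}
H(u,v) &= \int_0^T e^{-j2\pi u x_0(t)}\,dt \\
&= \int_0^T e^{-j2\pi u\left[(1/2)at^2\right]}\,dt \\
&= \int_0^T e^{-j\pi uat^2}\,dt \\
&= \int_0^T\left[\cos(\pi uat^2) - j\sin(\pi uat^2)\right]dt \\
&= \frac{1}{\sqrt{2ua}}\left[C(\sqrt{\pi ua}\,T) - jS(\sqrt{\pi ua}\,T)\right]
\end{aligned}$$

where

$$C(z) = \sqrt{\frac{2}{\pi}}\int_0^z\cos t^2\,dt$$

and

$$S(z) = \sqrt{\frac{2}{\pi}}\int_0^z\sin t^2\,dt.$$

These are Fresnel cosine and sine integrals. They can be found, for example, in the
Handbook of Mathematical Functions, by Abramowitz and Stegun, or other similar reference.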


Problem 5.19 


A basic approach for restoring a rotationally blurred image is to convert the image from
rectangular to polar coordinates. The blur will then appear as one-dimensional uniform
motion blur along the $\theta$-axis. Any of the techniques discussed in this chapter for
handling uniform blur along one dimension can then be applied to the problem. The image
is then converted back to rectangular coordinates after restoration. The mathematical
solution is simple. For any pixel with rectangular coordinates (x, y) we generate a
corresponding pixel with polar coordinates $(r, \theta)$, where

$$r = \sqrt{x^2 + y^2}$$

and

$$\theta = \tan^{-1}\left(\frac{y}{x}\right).$$

A display of the resulting image would show an image that is blurred along the $\theta$-axis
and would, in addition, appear distorted due to the coordinate conversion. Since the
extent of the rotational blur is known (it is given as $\pi/8$ radians), we can use the same
solution we used for uniform linear motion (Section 5.6.3), with $x = \theta$ and $y = r$,
to obtain the transfer function. Any of the methods in Sections 5.7 through 5.9 then
become applicable.
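
The coordinate conversion itself is small enough to sketch. The function names are hypothetical; `math.atan2` is used here as an assumption in place of a plain arctangent of y/x so that all four quadrants and x = 0 are handled.

```python
import math

def to_polar(x, y):
    """Map rectangular (x, y) to polar (r, theta), theta in radians."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)  # quadrant-aware form of arctan(y / x)
    return r, theta

def to_rect(r, theta):
    """Inverse mapping, used after restoration in the polar domain."""
    return r * math.cos(theta), r * math.sin(theta)
```

A full restoration would also need resampling onto a regular (r, theta) grid, which this sketch leaves out.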


Problem 5.20 


Measure the average value of the background. Set all pixels in the image, except the
cross hairs, to that gray level. Denote the Fourier transform of this image by G(u, v).
Since the characteristics of the cross hairs are given with a high degree of accuracy,
we can construct an image of the background (of the same size) using the background
gray levels determined previously. We then construct a model of the cross hairs in the
correct location (determined from the given image) using the provided dimensions and
gray level of the cross hairs. Denote by F(u, v) the Fourier transform of this new image.
The ratio G(u, v)/F(u, v) is an estimate of the blurring function H(u, v). In the likely
event of vanishing values in F(u, v), we can construct a radially-limited filter using the
method discussed in connection with Fig. 5.27. Because we know F(u, v) and G(u, v),
and an estimate of H(u, v), we can also refine our estimate of the blurring function
by substituting G and H in Eq. (5.8-3) and adjusting K to get as close as possible to a
good result for F(u, v) [the result can be evaluated visually by taking the inverse Fourier
transform]. The resulting filter in either case can then be used to deblur the image of the
heart, if desired.


Problem 5.21 


The key to solving this problem is to recognize that the given function

$$h(r) = \frac{r^2 - \sigma^2}{\sigma^4}\,e^{-r^2/2\sigma^2}$$

where $r^2 = x^2 + y^2$, is the Laplacian (second derivative with respect to r) of the function

$$h_0(r) = e^{-r^2/2\sigma^2}.$$

That is, $\nabla^2[h_0(r)]$ is equal to the given function. Then we know from Eq. (4.4-7) that,
for a function f(x, y),

$$\mathfrak{F}\left[\nabla^2 f(x,y)\right] = -(u^2+v^2)F(u,v).$$

Thus, we have reduced the problem to finding the Fourier transform of $e^{-r^2/2\sigma^2}$, which
is in the form of a Gaussian function. From Table 4.1, we note from the Gaussian
transform pair that the Fourier transform of a function of the form $e^{-(x^2+y^2)/2\sigma^2}$ is

$$\mathfrak{F}\left[e^{-(x^2+y^2)/2\sigma^2}\right] = \sqrt{2\pi}\,\sigma\,e^{-2\pi^2\sigma^2(u^2+v^2)}.$$

Therefore, the Fourier transform of the given degradation function is

$$\begin{aligned}
H(u,v) &= \mathfrak{F}\left[\frac{r^2 - \sigma^2}{\sigma^4}\,e^{-r^2/2\sigma^2}\right] \\
&= \mathfrak{F}\left[\nabla^2 h_0(r)\right] \\
&= -(u^2+v^2)F(u,v) \\
&= -\sqrt{2\pi}\,\sigma\,(u^2+v^2)\,e^{-2\pi^2\sigma^2(u^2+v^2)}.
\end{aligned}$$


Problem 5.22 


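This is a simple plug-in problem. Its purpose is to gain familiarity with the various terms
of the Wiener filter. From Eq. (5.8-3),

$$H_w(u,v) = \left[\frac{1}{H(u,v)}\,\frac{|H(u,v)|^2}{|H(u,v)|^2 + K}\right]$$

where

$$\begin{aligned}
|H(u,v)|^2 &= H^*(u,v)H(u,v) \\
&= 2\pi\sigma^2(u^2+v^2)^2\,e^{-4\pi^2\sigma^2(u^2+v^2)}.
\end{aligned}$$

Then,

$$H_w(u,v) = -\left[\frac{\sqrt{2\pi}\,\sigma\,(u^2+v^2)\,e^{-2\pi^2\sigma^2(u^2+v^2)}}{\left[2\pi\sigma^2(u^2+v^2)^2\,e^{-4\pi^2\sigma^2(u^2+v^2)}\right] + K}\right].$$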
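
The plug-in step can be mirrored at a single frequency sample. This is a minimal sketch with hypothetical function names; writing the filter as H*/(|H|^2 + K) is an algebraically equivalent form chosen here to avoid dividing by H where H vanishes, and returning zero when both |H|^2 and K are zero is an assumption of the sketch.

```python
import math

def degradation_tf(u, v, sigma):
    """H(u, v) of Problem 5.21 as written in the solution above."""
    rho2 = u * u + v * v
    return -math.sqrt(2.0 * math.pi) * sigma * rho2 * math.exp(
        -2.0 * math.pi ** 2 * sigma ** 2 * rho2)

def wiener_tf(h, k):
    """Parametric Wiener filter (1/H) |H|^2 / (|H|^2 + K) at one sample."""
    h = complex(h)
    denom = abs(h) ** 2 + k
    if denom == 0.0:
        return 0j  # assumption: suppress the sample instead of dividing by zero
    # (1/H) |H|^2 equals conj(H), so no division by H is required.
    return h.conjugate() / denom
```

With K = 0 the expression reduces to the inverse filter 1/H wherever H is nonzero, which is a quick sanity check on the algebra.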


Problem 5.23 


This also is a simple plug-in problem, whose purpose is the same as the previous problem.
From Eq. (5.9-4)

$$\begin{aligned}
H_c(u,v) &= \frac{H^*(u,v)}{|H(u,v)|^2 + \gamma|P(u,v)|^2} \\
&= -\frac{\sqrt{2\pi}\,\sigma\,(u^2+v^2)\,e^{-2\pi^2\sigma^2(u^2+v^2)}}{2\pi\sigma^2(u^2+v^2)^2\,e^{-4\pi^2\sigma^2(u^2+v^2)} + \gamma|P(u,v)|^2}
\end{aligned}$$

where P(u, v) is the Fourier transform of the Laplacian operator [Eq. (5.9-5)]. This is
as far as we can reasonably carry this problem. It is worthwhile pointing out to students
that a closed expression for the transform of the Laplacian operator was obtained in
Problem 4.19. However, substituting that solution for P(u, v) here would only increase
the number of terms in the filter and would not aid at all in simplifying the expression.


Problem 5.24 


Because the system is assumed linear and position invariant, it follows that Eq. (5.5-17) 
holds. Furthermore, we can use superposition and obtain the response of the system 
first to F(u, v ) and then to N(u, v). The sum of the two individual responses gives the 
complete response. First, using only F(u, v), 

$$G_1(u,v) = H(u,v)F(u,v)$$

and

$$|G_1(u,v)|^2 = |H(u,v)|^2\,|F(u,v)|^2.$$

Then, using only N(u, v),

$$G_2(u,v) = N(u,v)$$

and

$$|G_2(u,v)|^2 = |N(u,v)|^2$$

so that

$$\begin{aligned}
|G(u,v)|^2 &= |G_1(u,v)|^2 + |G_2(u,v)|^2 \\
&= |H(u,v)|^2\,|F(u,v)|^2 + |N(u,v)|^2.
\end{aligned}$$


Problem 5.25 


(a) It is given that

$$|\hat{F}(u,v)|^2 = |R(u,v)|^2\,|G(u,v)|^2.$$

From Problem 5.24,

$$|\hat{F}(u,v)|^2 = |R(u,v)|^2\left[|H(u,v)|^2\,|F(u,v)|^2 + |N(u,v)|^2\right].$$

Forcing $|\hat{F}(u,v)|^2$ to equal $|F(u,v)|^2$ gives

$$R(u,v) = \left[\frac{|F(u,v)|^2}{|H(u,v)|^2\,|F(u,v)|^2 + |N(u,v)|^2}\right]^{1/2}.$$

(b)

$$\begin{aligned}
\hat{F}(u,v) &= R(u,v)G(u,v) \\
&= \left[\frac{|F(u,v)|^2}{|H(u,v)|^2\,|F(u,v)|^2 + |N(u,v)|^2}\right]^{1/2}G(u,v) \\
&= \left[\frac{1}{|H(u,v)|^2 + \dfrac{|N(u,v)|^2}{|F(u,v)|^2}}\right]^{1/2}G(u,v)
\end{aligned}$$

and, because $|F(u,v)|^2 = S_f(u,v)$ and $|N(u,v)|^2 = S_\eta(u,v)$,

$$\hat{F}(u,v) = \left[\frac{1}{|H(u,v)|^2 + \dfrac{S_\eta(u,v)}{S_f(u,v)}}\right]^{1/2}G(u,v).$$


Problem 5.26 


One possible solution: (1) Average images to reduce noise. (2) Obtain a blurred image of
a bright, single star to simulate an impulse (the star should be as small as possible in
the field of view of the telescope to simulate an impulse as closely as possible). (3) The
Fourier transform of this image will give H(u, v). (4) Use a Wiener filter and vary K
until the sharpest image possible is obtained.


Problem 5.27 


The basic idea behind this problem is to use the camera and representative coins to
model the degradation process and then utilize the results in an inverse filter operation.
The principal steps are as follows:

1. Select coins as close as possible in size and content to the lost coins. Select a
background that approximates the texture and brightness of the photos of the lost coins.

2. Set up the museum photographic camera in a geometry as close as possible to give 
images that resemble the images of the lost coins (this includes paying attention to 
illumination). Obtain a few test photos. To simplify experimentation, obtain a TV 
camera capable of giving images that resemble the test photos. This can be done by 
connecting the camera to an image processing system and generating digital images, 
which will be used in the experiment. 

3. Obtain sets of images of each coin with different lens settings. The resulting images
should approximate the aspect angle, size (in relation to the area occupied by the
background), and blur of the photos of the lost coins.

4. The lens setting for each image in (3) is a model of the blurring process for the
corresponding image of a lost coin. For each such setting, remove the coin and 
background and replace them with a small, bright dot on a uniform background, 
or other mechanism to approximate an impulse of light. Digitize the impulse. Its 
Fourier transform is the transfer function of the blurring process. 

5. Digitize each (blurred) photo of a lost coin, and obtain its Fourier transform. At this 
point, we have H (u, v) and G(u, v) for each coin. 

6. Obtain an approximation to F(u,v ) by using a Wiener filter. Equation (5.8-3) is 
particularly attractive because it gives an additional degree of freedom (K) for ex- 
perimenting. 

7. The inverse Fourier transform of each approximate F(u, v ) gives the restored image. 
In general, several experimental passes of these basic steps with various different 
settings and parameters are required to obtain acceptable results in a problem such 
as this. 
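
Steps 4 through 7 can be sketched in one dimension. Everything in the example is hypothetical: the signal `coin`, the impulse response `lens`, the use of circular convolution, and the direct O(N^2) DFT are illustrative assumptions, not the procedure of any particular imaging system.

```python
import cmath

def dft(seq, inverse=False):
    """Direct discrete Fourier transform (or its inverse) of a short sequence."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(seq[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def circular_blur(signal, kernel):
    """Circular convolution, standing in for the blurring process."""
    n = len(signal)
    return [sum(signal[(i - j) % n] * kernel[j] for j in range(n))
            for i in range(n)]

def wiener_restore(blurred, impulse_response, k):
    """Steps 4-7: H from the digitized impulse, G from the photo, then Eq. (5.8-3)."""
    g = dft(blurred)
    h = dft(impulse_response)
    f_hat = [hk.conjugate() * gk / (abs(hk) ** 2 + k) for hk, gk in zip(h, g)]
    return [value.real for value in dft(f_hat, inverse=True)]

coin = [0, 0, 1, 3, 1, 0, 0, 0]            # hypothetical sharp profile
lens = [0.6, 0.2, 0, 0, 0, 0, 0, 0.2]      # hypothetical digitized impulse
photo = circular_blur(coin, lens)
restored = wiener_restore(photo, lens, k=0.0)
```

Because this particular `lens` has no zeros in its transform and no noise was added, K = 0 recovers the profile; with real photographs K would be varied experimentally, as step 6 suggests.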


Problem 5.28 


Using triangular regions means three tiepoints, so we can solve the following set of
linear equations for six coefficients:

$$x' = c_1x + c_2y + c_3$$

$$y' = c_4x + c_5y + c_6$$

to implement spatial transformations. We also solve the following equation for three
coefficients

$$v(x', y') = ax' + by' + c$$

to implement gray level interpolation.
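
Each of the three equations above is a 3-by-3 linear system once three tiepoints are known. The sketch below solves it by Cramer's rule; the function names are hypothetical, and the same solver could be reused for the gray-level coefficients a, b, and c by passing three gray levels as targets.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(points, targets):
    """Solve t = p*x + q*y + r for (p, q, r) from three (x, y) points."""
    base = [[x, y, 1.0] for x, y in points]
    d = det3(base)
    if d == 0:
        raise ValueError("tiepoints are collinear")
    coeffs = []
    for col in range(3):
        m = [row[:] for row in base]
        for i in range(3):
            m[i][col] = targets[i]  # Cramer's rule: swap in the right-hand side
        coeffs.append(det3(m) / d)
    return tuple(coeffs)

def triangle_mapping(src, dst):
    """Return (c1, ..., c6) mapping the triangle src onto the triangle dst."""
    c1, c2, c3 = solve3(src, [p[0] for p in dst])
    c4, c5, c6 = solve3(src, [p[1] for p in dst])
    return c1, c2, c3, c4, c5, c6
```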
Problem 6.1


From the figure, x = 0.43 and y = 0.4. Since x + y + z = 1, it follows that z = 0.17.
These are the trichromatic coefficients. We are interested in tristimulus values X, Y,
and Z, which are related to the trichromatic coefficients by Eqs. (6.1-1) through (6.1-3).
We note, however, that all the tristimulus coefficients are divided by the same constant,
so their percentages relative to the trichromatic coefficients are the same as those of the
coefficients. Thus, the answer is X = 0.43, Y = 0.40, and Z = 0.17.


Problem 6.2


Denote by c the given color, and let its coordinates be denoted by $(x_0, y_0)$. The distance
between c and $c_1$ is

$$d(c, c_1) = \left[(x_0 - x_1)^2 + (y_0 - y_1)^2\right]^{1/2}.$$

Similarly, the distance between $c_1$ and $c_2$ is

$$d(c_1, c_2) = \left[(x_1 - x_2)^2 + (y_1 - y_2)^2\right]^{1/2}.$$

The percentage $p_1$ of $c_1$ in c is

$$p_1 = \frac{d(c_1, c_2) - d(c, c_1)}{d(c_1, c_2)} \times 100.$$

The percentage $p_2$ of $c_2$ is simply $p_2 = 100 - p_1$. In the preceding equation we see,
for example, that when $c = c_1$, then $d(c, c_1) = 0$ and it follows that $p_1 = 100\%$
and $p_2 = 0\%$. Similarly, when $d(c, c_1) = d(c_1, c_2)$, it follows that $p_1 = 0\%$ and
$p_2 = 100\%$. Values in between are easily seen to follow from these simple relations.
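
The two-color relation is easy to check numerically. This is a minimal sketch; the function name and the sample chromaticity coordinates are hypothetical, and the color c is assumed to lie on the segment joining the two given colors.

```python
import math

def two_color_percentages(c, c1, c2):
    """Percentages (p1, p2) of c1 and c2 in a color c on the segment c1-c2."""
    d_c_c1 = math.dist(c, c1)
    d_c1_c2 = math.dist(c1, c2)
    p1 = (d_c1_c2 - d_c_c1) / d_c1_c2 * 100.0
    return p1, 100.0 - p1
```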
Problem 6.3


Consider Fig. P6.3, in which $c_1$, $c_2$, and $c_3$ are the given vertices of the color triangle
and c is an arbitrary color point contained within the triangle or on its boundary. The
key to solving this problem is to realize that any color on the border of the triangle is
made up of proportions from the two vertices defining the line segment that contains the
point. The contribution to a point on the line by the color vertex opposite this line is 0%.

The line segment connecting points $c_3$ and c is shown extended (dashed segment) until
it intersects the line segment connecting $c_1$ and $c_2$. The point of intersection is denoted
$c_0$. Because we have the values of $c_1$ and $c_2$, if we knew $c_0$, we could compute the
percentages of $c_1$ and $c_2$ contained in $c_0$ by using the method described in Problem 6.2.
Denote the ratio of the content of $c_1$ and $c_2$ in $c_0$ by $R_{12}$. If we now add
color $c_3$ to $c_0$, we know from Problem 6.2 that the point will start to move toward $c_3$
along the line shown. For any position of a point along this line we could determine the
percentage of $c_3$ and $c_0$, again, by using the method described in Problem 6.2. What is
important to keep in mind is that the ratio $R_{12}$ will remain the same for any point along
the segment connecting $c_3$ and $c_0$. The color of the points along this line is different for
each position, but the ratio of $c_1$ to $c_2$ will remain constant.

So, if we can obtain $c_0$, we can then determine the ratio $R_{12}$, and the percentage of
$c_3$, in color c. The point $c_0$ is not difficult to obtain. Let $y = a_{12}x + b_{12}$ be the
straight line containing points $c_1$ and $c_2$, and $y = a_{3c}x + b_{3c}$ the line containing $c_3$ and
c. The intersection of these two lines gives the coordinates of $c_0$. The lines can be
determined uniquely because we know the coordinates of the two point pairs needed to
determine the line coefficients. Solving for the intersection in terms of these coordinates
is straightforward, but tedious. Our interest here is in the fundamental method, not the
mechanics of manipulating simple equations, so we do not give the details.

At this juncture we have the percentage of $c_3$ and the ratio between $c_1$ and $c_2$. Let the
percentages of these three colors composing c be denoted by $p_1$, $p_2$, and $p_3$ respectively.
Since we know that $p_1 + p_2 = 100 - p_3$, and that $p_1/p_2 = R_{12}$, we can solve for $p_1$ and
$p_2$. Finally, note that this problem could have been solved the same way by intersecting
one of the other two sides of the triangle. Going to another side would be necessary, for
example, if the line we used in the preceding discussion had an infinite slope. A simple
test to determine if the color of c is equal to any of the vertices should be the first step in
the procedure; in this case no additional calculations would be required.
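
The procedure can be sketched as follows. The function names are hypothetical, and two choices are assumptions of the sketch rather than of the solution: the intersection is computed in parametric form (which sidesteps the infinite-slope case), and the split between the first two colors is carried as a fraction rather than as the ratio R12, which would be unbounded when the second percentage is zero.

```python
import math

def intersect(p, q, r, s):
    """Intersection of the line through p, q with the line through r, s."""
    dx1, dy1 = q[0] - p[0], q[1] - p[1]
    dx2, dy2 = s[0] - r[0], s[1] - r[1]
    denom = dx1 * dy2 - dy1 * dx2
    if denom == 0:
        raise ValueError("lines are parallel")
    t = ((r[0] - p[0]) * dy2 - (r[1] - p[1]) * dx2) / denom
    return p[0] + t * dx1, p[1] + t * dy1

def three_color_percentages(c, c1, c2, c3):
    """Percentages (p1, p2, p3) of the triangle vertices in color c."""
    for i, vertex in enumerate((c1, c2, c3)):
        if tuple(c) == tuple(vertex):  # first step: is c one of the vertices?
            return tuple(100.0 if j == i else 0.0 for j in range(3))
    c0 = intersect(c3, c, c1, c2)      # extend c3-c until it meets c1-c2
    d30 = math.dist(c3, c0)
    p3 = (d30 - math.dist(c, c3)) / d30 * 100.0   # Problem 6.2 on segment c3-c0
    d12 = math.dist(c1, c2)
    share1 = (d12 - math.dist(c0, c1)) / d12      # fraction of c1 in c0
    p1 = (100.0 - p3) * share1
    return p1, 100.0 - p3 - p1, p3
```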
Figure P6.3 (axis label: y (Green))



Problem 6.4 

Use color filters sharply tuned to the wavelengths of the colors of the three objects. 
Thus, with a specific filter in place, only the objects whose color corresponds to that 
wavelength will produce a predominant response on the monochrome camera. A mo- 
torized filter wheel can be used to control filter position from a computer. If one of the 
colors is white, then the response of the three filters will be approximately equal and 
high. If one of the colors is black, the response of the three filters will be approximately 
equal and low. 

Problem 6.5 

At the center point we have 

$$\frac{1}{2}R + \frac{1}{2}B + G = \frac{1}{2}(R + G + B) + \frac{1}{2}G = \text{midgray} + \frac{1}{2}G$$

which looks to a viewer like pure green with a boost in intensity due to the additive gray
component.

Problem 6.6 


For the image given, the maximum intensity and saturation requirement means that the
RGB component values are 0 or 1. We can create the following table with 0 and 255
representing black and white, respectively:


Table P6.6

Color      R      G      B      Mono R   Mono G   Mono B
Black      0      0      0      0        0        0
Red        1      0      0      255      0        0
Yellow     1      1      0      255      255      0
Green      0      1      0      0        255      0
Cyan       0      1      1      0        255      255
Blue       0      0      1      0        0        255
Magenta    1      0      1      255      0        255
White      1      1      1      255      255      255
Gray       0.5    0.5    0.5    128      128      128


Thus, we get the monochrome displays shown in Fig. P6.6. 



Figure P6.6 (panels, left to right: Red Component, Green Component, Blue Component)


Problem 6.7 


There are $2^8 = 256$ possible values in each 8-bit image. For a color to be gray, all RGB
components have to be equal, so there are 256 shades of gray. 


Problem 6.8 


(a) All pixel values in the Red image are 255. In the Green image, the first column is
all 0's; the second column all 1's; and so on until the last column, which is composed of
all 255's. In the Blue image, the first row is all 255's; the second row all 254's, and so
on until the last row, which is composed of all 0's.

(b) Let the axis numbering be the same as in Fig. 6.7. Then: (0, 0, 0) = white,
(1, 1, 1) = black, (1, 0, 0) = cyan, (1, 1, 0) = blue, (1, 0, 1) = green, (0, 1, 1) =
red, (0, 0, 1) = yellow, (0, 1, 0) = magenta.

(c) The ones that do not contain the black or white point are fully saturated. The others 
decrease in saturation from the corners toward the black or white point. 


Problem 6.9 


(a) For the image given, the maximum intensity and saturation requirement means that
the RGB component values are 0 or 1. We can create Table P6.9 using Eq. (6.2-1):


Table P6.9

Color      R      G      B      C      M      Y      Mono C   Mono M   Mono Y
Black      0      0      0      1      1      1      255      255      255
Red        1      0      0      0      1      1      0        255      255
Yellow     1      1      0      0      0      1      0        0        255
Green      0      1      0      1      0      1      255      0        255
Cyan       0      1      1      1      0      0      255      0        0
Blue       0      0      1      1      1      0      255      255      0
Magenta    1      0      1      0      1      0      0        255      0
White      1      1      1      0      0      0      0        0        0
Gray       0.5    0.5    0.5    0.5    0.5    0.5    128      128      128


Thus, we get the monochrome displays shown in Fig. P6.9(a). 

(b) The resulting display is the complement of the starting RGB image. From left to 
right, the color bars are (in accordance with Fig. 6.32) white, cyan, blue, magenta, red, 
yellow, green, and black. The middle gray background is unchanged. 
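
The entries of Table P6.9 follow directly from the complement relation between RGB and CMY. A minimal sketch, with hypothetical function names and with rounding to the nearest 8-bit level taken as an assumption about how the monochrome values were obtained:

```python
def rgb_to_cmy(rgb):
    """CMY components as complements of normalized RGB components."""
    return tuple(1.0 - v for v in rgb)

def to_8bit(value):
    """Scale a normalized value to [0, 255], rounding halves upward."""
    return int(value * 255 + 0.5)
```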



Figure P6.9 (panels, left to right: Cyan Component, Magenta Component, Yellow Component)
Problem 6.10


Equation (6.2-1) reveals that each component of the CMY image is a function of a single
component of the corresponding RGB image: C is a function of R, M of G, and Y of
B. For clarity, we will use a prime to denote the CMY components. From Eq. (6.5-6),
we know that

$$s_i = kr_i$$

for i = 1, 2, 3 (for the R, G, and B components). And from Eq. (6.2-1), we know
that the CMY components corresponding to the $r_i$ and $s_i$ (which we are denoting with
primes) are

$$r_i' = 1 - r_i$$

and

$$s_i' = 1 - s_i.$$

Thus,

$$r_i = 1 - r_i'$$

and

$$s_i' = 1 - s_i = 1 - kr_i = 1 - k(1 - r_i')$$

so that

$$s_i' = kr_i' + (1 - k).$$


Problem 6.11 


(a) The purest green is 00FF00, which corresponds to cell (7, 18). 

(b) The purest blue is 0000FF, which corresponds to cell (12, 13). 


Problem 6.12 


Using Eqs. (6.2-2) through (6.2-4), we get the results shown in Table P6.12. Note that, in
accordance with Eq. (6.2-2), hue is undefined when R = G = B since $\theta = \cos^{-1}(0/0)$.
In addition, saturation is undefined when R = G = B = 0 since Eq. (6.2-3) yields
$S = 1 - \frac{3\min(0)}{3 \cdot 0} = 1 - \frac{0}{0}$. Thus, we get the monochrome display shown in Fig. P6.12.

Table P6.12

Color      R      G      B      H      S      I      Mono H   Mono S   Mono I
Black      0      0      0      -      0      0      -        -        0
Red        1      0      0      0      1      0.33   0        255      85
Yellow     1      1      0      0.17   1      0.67   43       255      170
Green      0      1      0      0.33   1      0.33   85       255      85
Cyan       0      1      1      0.5    1      0.67   128      255      170
Blue       0      0      1      0.67   1      0.33   170      255      85
Magenta    1      0      1      0.83   1      0.67   213      255      170
White      1      1      1      -      0      1      -        0        255
Gray       0.5    0.5    0.5    -      0      0.5    -        0        128
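
The H, S, and I columns can be reproduced with a direct transcription of the RGB-to-HSI relations. This is a minimal sketch: the function name is hypothetical, hue is normalized by 360 degrees to match the table, and returning `None` for the undefined 0/0 cases is a convention chosen here for illustration.

```python
import math

def rgb_to_hsi(r, g, b):
    """Return (H, S, I); H is normalized to [0, 1) and None marks 0/0 cases."""
    total = r + g + b
    intensity = total / 3.0
    saturation = None if total == 0 else 1.0 - 3.0 * min(r, g, b) / total
    # The radicand equals half the sum of squared pairwise differences,
    # so it is zero exactly when R = G = B.
    denom = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if denom == 0:
        return None, saturation, intensity
    cos_theta = 0.5 * ((r - g) + (r - b)) / denom
    theta = math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))
    hue = theta if b <= g else 360.0 - theta
    return hue / 360.0, saturation, intensity
```

Note that the sketch reports the saturation of black as undefined, in line with the remark on Eq. (6.2-3) above, whereas the table lists a value of 0 in the S column for that row.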



Figure P6.12 


Problem 6.13 


With reference to the HSI color circle in Fig. 6.14(b), deep purple is found at
approximately 270°. To generate a color rectangle with the properties required in the problem
statement, we choose a fixed intensity I and maximum saturation S (these are spectrum
colors, which are supposed to be fully saturated). The first column in the rectangle
uses these two values and a hue of 270°. The next column (and all subsequent columns)
would use the same values of I and S, but the hue would be decreased to 269°, and so
on all the way down to a hue of 0°, which corresponds to red. If the image is limited to
8 bits, then we can only have 256 variations in hue in the range from 270° down to 0°,
which will require a different uniform spacing than one-degree increments or,
alternatively, starting at 255° and proceeding in increments of 1°, but this would leave out
most of the purple. If we have more than eight bits, then the increments can be smaller.
Longer strips also can be made by duplicating column values.


Problem 6.14

There are two important aspects to this problem. One is to approach it in HSI space 
and the other is to use polar coordinates to create a hue image whose values grow as a 
function of angle. The center of the image is the middle of whatever image area is used. 
Then, for example, the values of the hue image along a radius when the angle is 0° would 
be all 0's. The angle then is incremented by, say, one degree, and all the values along
that radius would be 1's, and so on. Values of the saturation image decrease linearly
in all radial directions from the origin. The intensity image is just a specified constant. 
With these basics in mind it is not difficult to write a program that generates the desired 
result. 
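
One way such a program might look is sketched below. The function name, the use of nested lists instead of an image library, the choice of hue in degrees, and the normalization of saturation by the distance to the image corners are all assumptions made for illustration.

```python
import math

def hsi_wheel(size, intensity=0.5):
    """Return (hue, saturation, intensity) images as size x size nested lists."""
    center = (size - 1) / 2.0
    max_radius = center * math.sqrt(2.0) or 1.0  # distance to a corner
    hue, sat, inten = [], [], []
    for row in range(size):
        hue_row, sat_row = [], []
        for col in range(size):
            dx, dy = col - center, center - row      # y grows upward
            angle = math.degrees(math.atan2(dy, dx)) % 360.0
            radius = math.hypot(dx, dy)
            hue_row.append(angle)                    # hue grows with angle
            sat_row.append(1.0 - radius / max_radius)  # linear radial decrease
        hue.append(hue_row)
        sat.append(sat_row)
        inten.append([intensity] * size)             # constant intensity
    return hue, sat, inten
```

The three images would then be converted from HSI to RGB for display, which is outside the scope of this sketch.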

Problem 6.15 

The hue, saturation, and intensity images are shown in Fig. P6.15, from left to right. 



Figure P6.15 


Problem 6.16 


(a) It is given that the colors in Fig. 6.16(a) are primary spectrum colors. It also is
given that the gray-level images in the problem statement are 8-bit images. The latter
condition means that hue (angle) can only be divided into a maximum number of 256
values. Since hue values are represented in the interval from 0° to 360°, this means
that for an 8-bit image the increments between contiguous hue values are now 360/255.
Another way of looking at this is that the entire [0, 360] hue scale is compressed to the
range [0, 255]. Thus, for example, yellow (the first primary color we encounter), which
is 60°, now becomes 43 (the closest integer) in the integer scale of the 8-bit image shown
in the problem statement. Similarly, green, which is 120°, becomes 85 in this image.
From this we easily compute the values of the other two regions as being 170 and 213.
The region in the middle is pure white [equal proportions of red, green, and blue in Fig.
6.16(a)] so its hue by definition is 0. This also is true of the black background.

(b) The colors are spectrum colors, so they are fully saturated. Therefore, the value
of 255 shown applies to all circle regions. The region in the center of the color image is
white, so its saturation is 0.

(c) The key to getting the values in this figure is to realize that the center portion of the
color image is white, which means equal intensities of fully saturated red, green, and
blue. Therefore, both darker gray regions in the intensity image have value
85 (i.e., the same value as the other corresponding region). Similarly, equal proportions
of the secondaries yellow, cyan, and magenta produce white, so the two lighter gray
regions have the same value (170) as the region shown in the figure. The center of the
image is white, so its value is 255.
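
The hue values quoted in part (a) follow from compressing [0, 360] to [0, 255]. A minimal sketch, with a hypothetical function name and with halves rounded upward as an assumption:

```python
def hue_to_8bit(degrees):
    """Compress a hue angle in [0, 360] to the 8-bit range [0, 255]."""
    return int(degrees * 255.0 / 360.0 + 0.5)

levels = [hue_to_8bit(d) for d in (60, 120, 240, 300)]
```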


Problem 6.17 


(a) Because the infrared image which was used in place of the red component image has 
very high gray-level values. 

(b) The water appears as solid black (0) in the near infrared image [Fig. 6.27(d)]. 
Threshold the image with a threshold value slightly larger than 0. The result is shown 
in Fig. P6. 17. It is clear that coloring all the black points in the desired shade of blue 
presents no difficulties. 

(c) Note that the predominant color of natural terrain is in various shades of red. We
already know how to take out the water from (b). Thus a method that actually removes the
"background" of red and black would leave predominantly the other man-made
structures, which appear mostly in a bluish light color. Removal of the red [and the black
if you do not want to use the method as in (b)] can be done by using the technique
discussed in Section 6.7.2.


Problem 6.18

Using Eq. (6.2-3), we see that the basic problem is that many different colors have the 
same saturation value. This was demonstrated in Problem 6.12, where pure red, yellow, 
green, cyan, blue, and magenta all had a saturation of 1. That is, as long as any one of
the RGB components is 0, Eq. (6.2-3) yields a saturation of 1.

Consider RGB colors (1, 0, 0) and (0, 0.59, 0), which represent a red and a green. 
The HSI triplets for these colors [per Eq. (6.4-2) through (6.4-4)] are (0, 1, 0.33) and 
(0.33, 1, 0.2), respectively. Now, the complements of the beginning RGB values (see 
Section 6.5.2) are (0, 1, 1) and (1, 0.41, 1), respectively; the corresponding colors are 
cyan and magenta. Their HSI values [per Eqs. (6.4-2) through (6.4-4)] are (0.5, 1, 0.66) 
and (0.83, 0.48, 0.8), respectively. Thus, for the red, a starting saturation of 1 yielded 
the cyan “complemented” saturation of 1, while for the green, a starting saturation of 
1 yielded the magenta “complemented” saturation of 0.48. That is, the same starting 
saturation resulted in two different “complemented” saturations. Saturation alone is not 
enough information to compute the saturation of the complemented color. 
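
The two cases discussed above can be reproduced in a few lines. The function names are hypothetical; the saturation formula is the one from Eq. (6.2-3) and is only evaluated here for colors with a nonzero component sum.

```python
def saturation(r, g, b):
    """HSI saturation; assumes r + g + b is nonzero."""
    return 1.0 - 3.0 * min(r, g, b) / (r + g + b)

def complement(rgb):
    """RGB complement of a normalized color."""
    return tuple(1.0 - v for v in rgb)

red, green = (1.0, 0.0, 0.0), (0.0, 0.59, 0.0)
before = [saturation(*red), saturation(*green)]
after = [saturation(*complement(red)), saturation(*complement(green))]
```

Both entries of `before` are 1, while the entries of `after` differ, which is the point of the argument.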

Problem 6.19 


The complement of a color is the color opposite it on the color circle of Fig. 6.32. The 
hue component is the angle from red in a counterclockwise direction normalized by 360 
degrees. For a color on the top half of the circle (i.e., 0 < H < 0.5), the hue of the
complementary color is H + 0.5. For a color on the bottom half of the circle (i.e., for
0.5 < H < 1), the hue of the complement is H - 0.5.


Problem 6.20 


The RGB transformations for a complement [from Fig. 6.33(b)] are:

$$s_i = 1 - r_i$$

where i = 1, 2, 3 (for the R, G, and B components). But from the definition of the
CMY space in Eq. (6.2-1), we know that the CMY components corresponding to $r_i$ and
$s_i$, which we will denote using primes, are

$$r_i' = 1 - r_i$$

$$s_i' = 1 - s_i.$$

Thus,

$$r_i = 1 - r_i'$$

and

$$s_i' = 1 - s_i = 1 - (1 - r_i) = 1 - (1 - (1 - r_i'))$$

so that

$$s_i' = 1 - r_i'.$$


Problem 6.21 


The RGB transformation should darken the highlights and lighten the shadow areas, 
effectively compressing all values toward the midtones. The red, green, and blue com- 
ponents should be transformed with the same mapping function so that the colors do not 
change. The general shape of the curve would be as shown in Fig. P6.21. 



Figure P6.21


Problem 6.22


Based on the discussion in Section 6.5.4 and with reference to the color wheel in Fig.
6.32, we can decrease the proportion of yellow by (1) decreasing yellow, (2) increasing
blue, (3) increasing cyan and magenta, or (4) decreasing red and green.


Problem 6.23 


The L*a*b* components are computed using Eqs. (6.5-9) through (6.5-12). Reference 
white is R = G = B = 1. The computations are best done in a spreadsheet, as shown 
in Table P6.23. 


Table P6.23

Color    R    G    B    X     Y     Z     X/Xw  Y/Yw  Z/Zw  h(X/Xw) h(Y/Yw) h(Z/Zw) L*   a*    b*
Ref.     1    1    1    0.95  1.00  1.10  1     1     1     1       1       1       100  0     0
Black    0    0    0    0     0     0     0     0     0     0.14    0.14    0.14    0    0     0
Red      1    0    0    0.59  0.29  0     0.62  0.29  0     0.85    0.66    0.14    83   95    105
Yellow   1    1    0    0.77  0.90  0.07  0.81  0.90  0.06  0.93    0.96    0.40    92   -16   113
Green    0    1    0    0.18  0.61  0.07  0.19  0.61  0.06  0.57    0.85    0.40    51   -136  90
Cyan     0    1    1    0.36  0.71  1.09  0.38  0.71  1     0.73    0.89    1       68   -84   -22
Blue     0    0    1    0.18  0.11  1.02  0.19  0.11  0.94  0.58    0.47    0.98    51   53    -101
Magenta  1    0    1    0.77  0.40  1.02  0.81  0.40  0.94  0.93    0.73    0.98    92   100   -49
White    1    1    1    0.95  1.00  1.10  1     1     1     1       1       1       100  0     0
Gray     0.5  0.5  0.5  0.48  0.50  0.55  0.5   0.5   0.5   0.79    0.79    0.79    76   0     0


Problem 6.24 


The conceptually simplest approach is to transform every input image to the HSI color
space, perform histogram specification per the discussion in Section 3.3.2 on the
intensity (I) component only (leaving H and S alone), and convert the resulting intensity
component with the original hue and saturation components back to the starting color
space.


Problem 6.25 


(a) The boundary between red and green becomes thickened and yellow as a result of
blurring between the red and green primaries (recall that yellow is the color between
green and red in, for example, Fig. 6.14). The boundary between green and blue is
similarly blurred into a cyan color. The result is shown in Fig. P6.25.

(b) Blurring has no effect in this case. The intensity image is constant (at its maximum
value) because the pure colors are fully saturated.


Problem 6.26 


This is a simple problem to encourage the student to think about the meaning of the
elements in Eq. (6.7-2). When C = I, it follows that $C^{-1} = I$ and Eq. (6.7-2) becomes

$$D(z, a) = \left[(z - a)^T(z - a)\right]^{1/2}.$$

But the term inside the brackets is recognized as the inner product of the vector (z - a)
with itself, which, by definition, is equal to the right side of Eq. (6.7-1).


Figure P6.25 (regions labeled Red, Green, and Blue; the blurred boundaries are labeled Yellow and Cyan)


Problem 6.27 


(a) The cube is composed of 6 intersecting planes in RGB space. The general equation
for such planes is

\[ a z_R + b z_G + c z_B + d = 0 \]

where a, b, c, and d are parameters and the z's are the components of any point (vector)
z in RGB space lying on the plane. If an RGB point z does not lie on the plane, and
its coordinates are substituted in the preceding equation, then the equation will give
either a positive or a negative value; it will not yield zero. We say that z lies on the
positive or negative side of the plane, depending on whether the result is positive or
negative. We can change the positive side of a plane by multiplying all of its
coefficients (a, b, c, and d) by -1, which leaves the plane itself unchanged.
Suppose that we test the point a given in the problem statement to see whether it is on
the positive or negative side of each of the six planes composing the box, and change the
coefficients of any plane for which the result is negative. Then, a will lie on the positive
side of all planes composing the bounding box. In fact all points inside the bounding
box will yield positive values when their coordinates are substituted in the equations of
the planes. Points outside the box will give at least one negative or zero value. Thus,
the method consists of substituting an unknown color point in the equations of all six
planes. If all the results are positive, the point is inside the box; otherwise it is outside
the box. A flow diagram is asked for in the problem statement to make it simpler to
evaluate the student’s line of reasoning.

(b) If the box is lined up with the RGB coordinate axes, then the planes intersect the
RGB coordinate planes perpendicularly. The intersections of pairs of parallel planes
establish a range of values along each of the RGB axes that must be checked to see
whether an unknown point lies inside the box or not. This can be done on an
image-by-image basis (i.e., on the three component images of an RGB image), designating
by 1 a coordinate that is within its corresponding range and by 0 one that is not. This
produces three binary images which, when ANDed, give all the points inside the box.
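The range test of part (b) can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names `in_range_mask` and `inside_box`, the 2 x 2 component images, and the box limits are all hypothetical, and points on a face of the box are treated as outside, in line with part (a).

```python
def in_range_mask(plane, low, high):
    """Binary image: 1 where low < value < high, otherwise 0."""
    return [[1 if low < value < high else 0 for value in row] for row in plane]


def inside_box(red, green, blue, box):
    """AND the three per-channel binary images to flag pixels inside the box."""
    masks = [in_range_mask(plane, low, high)
             for plane, (low, high) in zip((red, green, blue), box)]
    return [[r & g & b for r, g, b in zip(*rows)] for rows in zip(*masks)]


# Hypothetical 2 x 2 component images and box limits, for illustration only.
red = [[10, 200], [120, 130]]
green = [[50, 60], [55, 250]]
blue = [[90, 95], [100, 105]]
box = [(100, 210), (40, 70), (80, 110)]  # (low, high) along the R, G, and B axes

mask = inside_box(red, green, blue, box)
print(mask)
```

Only the two pixels whose R, G, and B values all fall within their ranges are marked with 1 in the resulting mask.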


Problem 6.28 


The sketch is an elongated ellipsoidal figure in which the length lined up with the R-axis
is 8 times longer than the other two dimensions. In other words, the figure looks like a
blimp aligned with the R-axis.


Problem 6.29 


Set one of the three primary images to a constant value (say, 0), then consider the two
images shown in Fig. P6.29. If we formed an RGB composite image by letting the image
on the left be the red component and the image on the right the green component,
then the result would be an image with a green region on the left separated by a vertical
edge from a red region on the right. To compute the gradient of each component image
we take first-order partial derivatives. In this case, only the component of the
derivative in the horizontal direction is nonzero. If we model the edge as a ramp edge [Fig.
3.38(b)] then a profile of the derivative image would appear as shown in Fig. P6.29. The
magnified view shows clearly that the derivatives of the two images are mirrors of each
other. Thus, if we computed the gradient vector of each image and added the results as
suggested in the problem statement, the components of the gradient would cancel out,
giving a zero gradient for a color image that has a clearly defined edge between two
different color regions. This simple example illustrates that the gradient vector of a color
image is not equivalent to the result of forming a color gradient vector from the sum of
the gradient vectors of the individual component images.
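The cancellation can be checked numerically on one scan line. The sketch below assumes a hypothetical ramp edge (the sample values are made up) and uses a simple forward difference as the derivative. The Euclidean magnitude of the stacked per-channel derivatives is included only as a contrast; it is not presented as the color gradient method of the text.

```python
import math


def horizontal_difference(row):
    """First-order forward difference along a scan line."""
    return [row[i + 1] - row[i] for i in range(len(row) - 1)]


# Hypothetical scan lines across the edge: red ramps up while green ramps down.
red = [0, 0, 1, 2, 3, 3]
green = [3, 3, 2, 1, 0, 0]

d_red = horizontal_difference(red)      # [0, 1, 1, 1, 0]
d_green = horizontal_difference(green)  # [0, -1, -1, -1, 0]

# Adding the component derivatives cancels the edge completely.
summed = [r + g for r, g in zip(d_red, d_green)]

# Treating the two derivatives as one vector keeps the edge visible.
magnitude = [math.hypot(r, g) for r, g in zip(d_red, d_green)]

print(summed)
print(magnitude)
```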



Figure P6.29


7 Problem Solutions


Problem 7.1 


Following the explanation in Example 7.1, the decoder is as shown in Fig. P7.1.


Figure P7.1 (block diagram: the level j - 1 approximation passes through the 2↑ upsampler and the interpolation filter to form the prediction, which is added to the level j prediction residual to produce the level j approximation)


Problem 7.2 


A mean approximation pyramid is formed by computing 2 x 2 block averages. Since the
starting image is of size 4 x 4, J = 2, and f(x, y) is placed in level 2 of the mean
approximation pyramid. The level 1 approximation is (by taking 2 x 2 block averages
over f(x, y) and subsampling):

\[ \begin{bmatrix} 3.5 & 5.5 \\ 11.5 & 13.5 \end{bmatrix} \]

and the level 0 approximation is similarly [8.5]. The completed mean approximation
pyramid is

\[ \left\{ \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \end{bmatrix},\; \begin{bmatrix} 3.5 & 5.5 \\ 11.5 & 13.5 \end{bmatrix},\; [8.5] \right\}. \]


Problem 7.3


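Since no interpolation filtering is specified, pixel replication is used in the generation of
the mean prediction residual pyramid levels. Level 0 of the prediction residual pyramid
is the lowest resolution approximation, [8.5]. The level 2 prediction residual is obtained
by upsampling the level 1 approximation and subtracting it from the level 2 approximation
(the original image). Thus, we get

\[ \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \end{bmatrix} - \begin{bmatrix} 3.5 & 3.5 & 5.5 & 5.5 \\ 3.5 & 3.5 & 5.5 & 5.5 \\ 11.5 & 11.5 & 13.5 & 13.5 \\ 11.5 & 11.5 & 13.5 & 13.5 \end{bmatrix} = \begin{bmatrix} -2.5 & -1.5 & -2.5 & -1.5 \\ 1.5 & 2.5 & 1.5 & 2.5 \\ -2.5 & -1.5 & -2.5 & -1.5 \\ 1.5 & 2.5 & 1.5 & 2.5 \end{bmatrix}. \]

Similarly, the level 1 prediction residual is obtained by upsampling the level 0
approximation and subtracting it from the level 1 approximation to yield

\[ \begin{bmatrix} 3.5 & 5.5 \\ 11.5 & 13.5 \end{bmatrix} - \begin{bmatrix} 8.5 & 8.5 \\ 8.5 & 8.5 \end{bmatrix} = \begin{bmatrix} -5 & -3 \\ 3 & 5 \end{bmatrix}. \]

The mean prediction residual pyramid is therefore

\[ \left\{ \begin{bmatrix} -2.5 & -1.5 & -2.5 & -1.5 \\ 1.5 & 2.5 & 1.5 & 2.5 \\ -2.5 & -1.5 & -2.5 & -1.5 \\ 1.5 & 2.5 & 1.5 & 2.5 \end{bmatrix},\; \begin{bmatrix} -5 & -3 \\ 3 & 5 \end{bmatrix},\; [8.5] \right\}. \]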
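Both pyramids for the 4 x 4 image can be reproduced with a short script. This is a minimal sketch in plain Python; the helper names (`block_mean_2x2`, `replicate_2x`, `subtract`) are illustrative choices, and upsampling is done by pixel replication as assumed in the solution.

```python
def block_mean_2x2(image):
    """Average non-overlapping 2 x 2 blocks (mean filtering plus subsampling)."""
    half = len(image) // 2
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
              + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4
             for c in range(half)] for r in range(half)]


def replicate_2x(image):
    """Upsample by 2 in each direction using pixel replication."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def subtract(a, b):
    """Element-by-element difference of two equally sized images."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


f = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]

level1 = block_mean_2x2(f)       # level 1 approximation
level0 = block_mean_2x2(level1)  # level 0 approximation

residual2 = subtract(f, replicate_2x(level1))       # level 2 prediction residual
residual1 = subtract(level1, replicate_2x(level0))  # level 1 prediction residual

for name, level in (("level1", level1), ("level0", level0),
                    ("residual2", residual2), ("residual1", residual1)):
    print(name, level)
```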


The number of elements in a J + 1 level pyramid is bounded by 4/3 times the number of
elements in the level J image (see Section 7.1.1):

\[ 2^{2J} \left[ 1 + \frac{1}{(4)^1} + \frac{1}{(4)^2} + \cdots + \frac{1}{(4)^J} \right] \le \frac{4}{3}\, 2^{2J} \]

for J > 0. We can generate Table P7.3:

Table P7.3

J     Pyramid Elements    Compression Ratio
0     1                   1
1     5                   5/4 = 1.25
2     21                  21/16 = 1.3125
3     85                  85/64 = 1.328
∞                         4/3 = 1.33


All but the trivial case (J = 0) are expansions. The expansion factor is a function of J
and bounded by 4/3 or 1.33.
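The entries of Table P7.3 follow from summing 4^j over the pyramid levels. The sketch below recomputes them with exact fractions; the function names are illustrative.

```python
from fractions import Fraction


def pyramid_elements(J):
    """Total number of elements in a (J + 1)-level pyramid: 1 + 4 + ... + 4**J."""
    return sum(4 ** j for j in range(J + 1))


def expansion_factor(J):
    """Pyramid elements divided by the 4**J elements of the level J image."""
    return Fraction(pyramid_elements(J), 4 ** J)


for J in range(4):
    ratio = expansion_factor(J)
    print(J, pyramid_elements(J), ratio, float(ratio))
```

Every ratio stays strictly below 4/3, which is approached only as J grows without bound.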


Problem 7.4 


(a) The QMF filters must satisfy Eqs. (7.1-9) and (7.1-10). From Table 7.1, G_0(z) =
H_0(z) and H_1(z) = H_0(-z), so H_1(-z) = H_0(z). Thus, beginning with Eq. (7.1-9),

\[ \begin{aligned} H_0(-z)G_0(z) + H_1(-z)G_1(z) &= 0 \\ H_0(-z)H_0(z) - H_0(z)H_0(-z) &= 0 \\ 0 &= 0. \end{aligned} \]

Similarly, beginning with Eq. (7.1-10) and substituting for H_1(z), G_0(z), and G_1(z)
from rows 2, 3, and 4 of Table 7.1, we get

\[ \begin{aligned} H_0(z)G_0(z) + H_1(z)G_1(z) &= 2 \\ H_0(z)H_0(z) + H_0(-z)[-H_0(-z)] &= 2 \\ H_0^2(z) - H_0^2(-z) &= 2 \end{aligned} \]

which is the design equation for the H_0(z) prototype filter in row 1 of the table.

(b) The orthonormal filter proof follows the QMF proof in (a). For Eq. (7.1-9), we get

\[ \begin{aligned} H_0(-z)G_0(z) + H_1(-z)G_1(z) &= 0 \\ G_0[(-z)^{-1}]G_0(z) + G_1[(-z)^{-1}]\,[-z^{-2K+1}G_0(-z^{-1})] &= 0 \\ G_0(-z^{-1})G_0(z) - z^{-2K+1}G_1(-z^{-1})G_0(-z^{-1}) &= 0 \\ G_0(-z^{-1})G_0(z) - z^{-2K+1}\left\{ -(-z^{-1})^{-2K+1}G_0(-[-z^{-1}]^{-1}) \right\} G_0(-z^{-1}) &= 0 \\ G_0(-z^{-1})G_0(z) - z^{-2K+1}\left\{ z^{2K-1}G_0(z) \right\} G_0(-z^{-1}) &= 0 \\ G_0(-z^{-1})G_0(z) - G_0(z)G_0(-z^{-1}) &= 0. \end{aligned} \]

Similarly, beginning with Eq. (7.1-10),

\[ \begin{aligned} H_0(z)G_0(z) + H_1(z)G_1(z) &= 2 \\ G_0(z^{-1})G_0(z) + G_1(z^{-1})G_1(z) &= 2 \\ G_0(z^{-1})G_0(z) + \left[ -(z^{-1})^{-2K+1}G_0(-[z^{-1}]^{-1}) \right]\left[ -z^{-2K+1}G_0(-z^{-1}) \right] &= 2 \\ G_0(z^{-1})G_0(z) + (-z^{-2K+1})(-z^{-[-2K+1]})\,G_0(-z)G_0(-z^{-1}) &= 2 \\ G_0(z^{-1})G_0(z) + G_0(-z)G_0(-z^{-1}) &= 2 \end{aligned} \]

which is the design equation for the G_0(z) prototype filter in row 3 of the table.


Problem 7.5 


To be biorthogonal, QMF filters must satisfy matrix Eq. (7.1-13). Letting

\[ a = \frac{2}{\det[\mathbf{H}_m(z)]} \]

in that expression we can write

\[ G_0(z) = a H_1(-z) \]
\[ G_1(z) = -a H_0(-z) \]

and see that the QMF filters in column 1 of Table 7.1 do satisfy it with a = 1. Thus,
QMF filters are biorthogonal. They are not orthonormal, however, since they do not
satisfy the requirements of column 3 in Table 7.1. For QMF filters, for instance,

\[ H_1(z) = H_0(-z) = -G_1(z) \]

but orthonormality (see column 3) requires that H_1(z) = G_1(z^{-1}).


Problem 7.6 


Example 7.2 defines h_0(n) for n = 0, 1, 2, ..., 7 to be about -0.01, 0.03, 0.03, -0.19,
-0.03, 0.63, 0.72, 0.23. Using Eq. (7.1-23) with 2K = 8, we can write

\[ g_0(7 - n) = h_0(n) \]
\[ g_1(n) = (-1)^n g_0(7 - n). \]

Thus g_0(n) is time-reversed h_0(n), or 0.23, 0.72, 0.63, -0.03, -0.19, 0.03, 0.03, -0.01.
In addition, g_1(n) is a time-reversed and modulated copy of g_0(n); that is, -0.01,
-0.03, 0.03, 0.19, -0.03, -0.63, 0.72, -0.23.

To numerically prove the orthonormality of the filters, let m = 0 in Eq. (7.1-22):

\[ \langle g_i(n), g_j(n) \rangle = \delta(i - j) \quad \text{with } i, j = \{0, 1\}. \]

Iterating over i and j we get

\[ \sum_n g_0^2(n) = \sum_n g_1^2(n) = 1 \]
\[ \sum_n g_0(n) g_1(n) = 0. \]

Substitution of the filter coefficient values into these two equations yields:

\[ \begin{aligned} \sum_n g_0(n) g_1(n) &= (0.23)(-0.01) + (0.72)(-0.03) + (0.63)(0.03) + (-0.03)(0.19) \\ &\quad + (-0.19)(-0.03) + (0.03)(-0.63) + (0.03)(0.72) + (-0.01)(-0.23) \\ &= 0 \end{aligned} \]

\[ \begin{aligned} \sum_n g_0^2(n) = \sum_n g_1^2(n) &= (\pm 0.23)^2 + (0.72)^2 + (\pm 0.63)^2 + (-0.03)^2 + (\pm 0.19)^2 \\ &\quad + (0.03)^2 + (\pm 0.03)^2 + (-0.01)^2 \\ &\approx 1, \end{aligned} \]

where the last result is approximate because the coefficient values are rounded to two
decimal places.
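The two sums are easy to confirm numerically. The sketch below uses the rounded two-decimal coefficients quoted above, so the energy comes out close to, rather than exactly, 1; the variable names are illustrative.

```python
# Rounded analysis filter coefficients h0(n), n = 0..7, as quoted in the solution.
h0 = [-0.01, 0.03, 0.03, -0.19, -0.03, 0.63, 0.72, 0.23]

# g0 is the time-reversed h0; g1(n) = (-1)**n * g0(7 - n).
g0 = h0[::-1]
g1 = [(-1) ** n * g0[7 - n] for n in range(8)]


def inner(a, b):
    """Inner product of two real sequences of equal length."""
    return sum(x * y for x, y in zip(a, b))


cross = inner(g0, g1)      # expected to vanish
energy_g0 = inner(g0, g0)  # expected to be close to 1
energy_g1 = inner(g1, g1)

print(g1)
print(cross, energy_g0, energy_g1)
```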


Problem 7.7 


Reconstruction is performed by reversing the decomposition process; that is, by
replacing the downsamplers with upsamplers and the analysis filters by their synthesis filter
counterparts, as shown in Fig. P7.7.


Figure P7.7
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Problem 7.8 


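The Haar transform matrix for N = 8 is

\[ \mathbf{H}_8 = \frac{1}{\sqrt{8}} \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\ \sqrt{2} & \sqrt{2} & -\sqrt{2} & -\sqrt{2} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \sqrt{2} & \sqrt{2} & -\sqrt{2} & -\sqrt{2} \\ 2 & -2 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & -2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 2 & -2 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 2 & -2 \end{bmatrix}. \]


Problem 7.9


(a) Equation (7.1-28) defines the 2 x 2 Haar transformation matrix as

\[ \mathbf{H}_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}. \]

Then, using Eq. (7.1-24), we get

\[ \mathbf{T} = \mathbf{H}\mathbf{F}\mathbf{H} = \left( \frac{1}{\sqrt{2}} \right)^2 \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 3 & -1 \\ 6 & 2 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 5 & 4 \\ -3 & 0 \end{bmatrix}. \]

(b) First, compute

\[ \mathbf{H}_2^{-1} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

so that

\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \]

Solving this matrix equation yields

\[ \mathbf{H}_2^{-1} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} = \mathbf{H}_2. \]

Thus,

\[ \mathbf{F} = \mathbf{H}^{-1}\mathbf{T}\mathbf{H}^{-1} = \left( \frac{1}{\sqrt{2}} \right)^2 \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 5 & 4 \\ -3 & 0 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 3 & -1 \\ 6 & 2 \end{bmatrix}. \]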
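The forward and inverse 2 x 2 Haar transforms can be verified with a few matrix products. This sketch uses a small hand-written `matmul` helper (an illustrative name) rather than a numerical library, and relies on the result above that the 2 x 2 Haar matrix is its own inverse.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


s = 1 / math.sqrt(2)
H2 = [[s, s], [s, -s]]  # 2 x 2 Haar matrix
F = [[3, -1], [6, 2]]   # image from the problem

T = matmul(matmul(H2, F), H2)       # forward transform, T = H F H
F_back = matmul(matmul(H2, T), H2)  # inverse transform, F = H^-1 T H^-1 with H^-1 = H

print([[round(v, 9) for v in row] for row in T])
print([[round(v, 9) for v in row] for row in F_back])
```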


Problem 7.10 


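(a) The basis is orthonormal and the coefficients are computed by the vector equivalent
of Eq. (7.2-5):

\[ \alpha_0 = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \frac{5\sqrt{2}}{2} \]

\[ \alpha_1 = \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \frac{\sqrt{2}}{2} \]

so,

\[ \frac{5\sqrt{2}}{2}\varphi_0 + \frac{\sqrt{2}}{2}\varphi_1 = \frac{5\sqrt{2}}{2} \begin{bmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{bmatrix} + \frac{\sqrt{2}}{2} \begin{bmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}. \]

(b) The basis is biorthonormal and the coefficients are computed by the vector equivalent
of Eq. (7.2-3):

\[ \alpha_0 = 1, \qquad \alpha_1 = 2 \]

so,

\[ \varphi_0 + 2\varphi_1 = \begin{bmatrix} 3 \\ 2 \end{bmatrix}. \]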

(c) The basis is overcomplete and the coefficients are computed by the vector equivalent 
of Eq. (7.2-3): 





Problem 7.11 

As can be seen in Fig. P7.11, scaling function \varphi_{0,0}(x) cannot be written as a sum of
double resolution copies of itself. Note the gap between \varphi_{1,0}(x) and \varphi_{1,1}(x).


Figure P7.11 (panels: \varphi_{0,0}(x) = \varphi(x); \varphi_{1,0}(x) = \sqrt{2}\varphi(2x); \varphi_{1,1}(x) = \sqrt{2}\varphi(2x - 1); horizontal axis ticks at 0, 1/4, 1/2, 3/4, and 1)


Problem 7.12 


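Substituting j = 3 into Eq. (7.2-13) we get

\[ \begin{aligned} V_3 &= \underset{k}{\mathrm{Span}} \{ \varphi_{3,k}(x) \} \\ &= \underset{k}{\mathrm{Span}} \{ 2^{3/2} \varphi(2^3 x - k) \} \\ &= \underset{k}{\mathrm{Span}} \{ 2\sqrt{2}\, \varphi(8x - k) \}. \end{aligned} \]

Using the Haar scaling function in Eq. (7.2-14) we get the results shown in Fig. P7.12.


Figure P7.12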
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Problem 7.13 

From Eq. (7.2-19) we find that

\[ \begin{aligned} \psi_{3,3}(x) &= 2^{3/2} \psi(2^3 x - 3) \\ &= 2\sqrt{2}\, \psi(8x - 3) \end{aligned} \]

and using the Haar wavelet function definition from Eq. (7.2-30), obtain the plot shown
in Fig. P7.13.

To express \psi_{3,3}(x) as a function of scaling functions, we employ Eq. (7.2-28) and the
Haar wavelet vector defined in Example 7.6, that is, h_\psi(0) = 1/\sqrt{2} and h_\psi(1) =
-1/\sqrt{2}. Thus we get

\[ \psi(x) = \sum_n h_\psi(n) \sqrt{2}\, \varphi(2x - n) \]

so that

\[ \begin{aligned} \psi(8x - 3) &= \sum_n h_\psi(n) \sqrt{2}\, \varphi(2[8x - 3] - n) \\ &= \frac{1}{\sqrt{2}} \sqrt{2}\, \varphi(16x - 6) + \left( \frac{-1}{\sqrt{2}} \right) \sqrt{2}\, \varphi(16x - 7) \\ &= \varphi(16x - 6) - \varphi(16x - 7). \end{aligned} \]

Then, since \psi_{3,3}(x) = 2\sqrt{2}\, \psi(8x - 3),

\[ \psi_{3,3}(x) = 2\sqrt{2}\, \varphi(16x - 6) - 2\sqrt{2}\, \varphi(16x - 7). \]

Figure P7.13 (plot of \psi_{3,3}(x) = 2\sqrt{2}\, \psi(8x - 3); vertical axis ticks at 2\sqrt{2}, 0, and -2\sqrt{2}; horizontal axis ticks at 0, 3/8, 1/2, and 1)

Problem 7.14

Using Eq. (7.2-22),

\[ \begin{aligned} V_3 &= V_2 \oplus W_2 \\ &= V_1 \oplus W_1 \oplus W_2 \\ &= V_0 \oplus W_0 \oplus W_1 \oplus W_2. \end{aligned} \]

The scaling and wavelet functions are plotted in Fig. P7.14.



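Figure P7.14

(a) Since M = 4, J = 2, and j_0 = 1, the summations in Eqs. (7.3-5) through (7.3-7) are
performed over x = 0, 1, 2, 3, j = 1, and k = 0, 1. Using Haar functions and assuming
that they are distributed over the range of the input sequence, we get

\[ \begin{aligned} W_\varphi(1,0) &= \tfrac{1}{2} \left[ f(0)\varphi_{1,0}(0) + f(1)\varphi_{1,0}(1) + f(2)\varphi_{1,0}(2) + f(3)\varphi_{1,0}(3) \right] \\ &= \tfrac{1}{2} \left[ (1)(\sqrt{2}) + (4)(\sqrt{2}) + (-3)(0) + (0)(0) \right] = \frac{5\sqrt{2}}{2} \\ W_\varphi(1,1) &= \tfrac{1}{2} \left[ f(0)\varphi_{1,1}(0) + f(1)\varphi_{1,1}(1) + f(2)\varphi_{1,1}(2) + f(3)\varphi_{1,1}(3) \right] \\ &= \tfrac{1}{2} \left[ (1)(0) + (4)(0) + (-3)(\sqrt{2}) + (0)(\sqrt{2}) \right] = \frac{-3\sqrt{2}}{2} \\ W_\psi(1,0) &= \tfrac{1}{2} \left[ f(0)\psi_{1,0}(0) + f(1)\psi_{1,0}(1) + f(2)\psi_{1,0}(2) + f(3)\psi_{1,0}(3) \right] \\ &= \tfrac{1}{2} \left[ (1)(\sqrt{2}) + (4)(-\sqrt{2}) + (-3)(0) + (0)(0) \right] = \frac{-3\sqrt{2}}{2} \\ W_\psi(1,1) &= \tfrac{1}{2} \left[ f(0)\psi_{1,1}(0) + f(1)\psi_{1,1}(1) + f(2)\psi_{1,1}(2) + f(3)\psi_{1,1}(3) \right] \\ &= \tfrac{1}{2} \left[ (1)(0) + (4)(0) + (-3)(\sqrt{2}) + (0)(-\sqrt{2}) \right] = \frac{-3\sqrt{2}}{2} \end{aligned} \]

so that the DWT is {5\sqrt{2}/2, -3\sqrt{2}/2, -3\sqrt{2}/2, -3\sqrt{2}/2}.

(b) Using Eq. (7.3-7),

\[ f(x) = \tfrac{1}{2} \left[ W_\varphi(1,0)\varphi_{1,0}(x) + W_\varphi(1,1)\varphi_{1,1}(x) + W_\psi(1,0)\psi_{1,0}(x) + W_\psi(1,1)\psi_{1,1}(x) \right] \]

which, with x = 1, where \varphi_{1,0}(1) = \sqrt{2}, \psi_{1,0}(1) = -\sqrt{2}, and
\varphi_{1,1}(1) = \psi_{1,1}(1) = 0, becomes

\[ \begin{aligned} f(1) &= \frac{\sqrt{2}}{4} \left[ (5)(\sqrt{2}) + (-3)(0) + (-3)(-\sqrt{2}) + (-3)(0) \right] \\ &= \frac{8(\sqrt{2})^2}{4} = 4, \end{aligned} \]

which agrees with the second sample of the input sequence {1, 4, -3, 0}.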
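The coefficients and the inverse transform can be checked with a direct implementation of the sums. This sketch assumes the input sequence {1, 4, -3, 0} and the sampled Haar functions at scale j = 1 used in part (a); the names `coefficient` and `reconstruct` are illustrative.

```python
import math

f = [1, 4, -3, 0]  # input sequence, M = 4
M = len(f)
r2 = math.sqrt(2)

# Sampled Haar scaling and wavelet functions at scale j = 1 for k = 0, 1.
phi = [[r2, r2, 0, 0], [0, 0, r2, r2]]
psi = [[r2, -r2, 0, 0], [0, 0, r2, -r2]]


def coefficient(basis):
    """Forward sum: (1 / sqrt(M)) * sum over x of f(x) * basis(x)."""
    return sum(fx * bx for fx, bx in zip(f, basis)) / math.sqrt(M)


W_phi = [coefficient(b) for b in phi]
W_psi = [coefficient(b) for b in psi]


def reconstruct(x):
    """Inverse sum evaluated at sample x."""
    total = sum(W_phi[k] * phi[k][x] + W_psi[k] * psi[k][x] for k in range(2))
    return total / math.sqrt(M)


print(W_phi + W_psi)
print([round(reconstruct(x), 9) for x in range(M)])
```

Evaluating `reconstruct` at every sample returns the original sequence, which is a convenient check on both the coefficients and the inverse sum.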

Problem 7.17 


Intuitively, the continuous wavelet transform (CWT) calculates a “resemblance index”
between the signal and the wavelet at various scales and translations. When the index is
large, the resemblance is strong; otherwise it is weak. Thus, if a function is similar to
itself at different scales, the resemblance index will be similar at different scales. The
CWT coefficient values (the index) will have a characteristic pattern. As a result, we can
say that the function whose CWT is shown is self-similar, like a fractal signal.


Problem 7.18 

(a) The scale and translation parameters are continuous, which leads to the
overcompleteness of the transform.

(b) The DWT is a better choice when we need a space saving representation that is
sufficient for reconstruction of the original function or image. The CWT is often easier
to interpret because the built-in redundancy tends to reinforce traits of the function or
image. For example, see the self-similarity of Problem 7.17.

Problem 7.19 


The filter bank is the first bank in Fig. 7.17, as shown in Fig. P7.19:


Figure P7.19 (input W_\varphi(2, n) = f(n) = {1, 4, -3, 0}; labeled sequences {-1/\sqrt{2}, -3/\sqrt{2}, 7/\sqrt{2}, -3/\sqrt{2}, 0} and {1/\sqrt{2}, 5/\sqrt{2}, 1/\sqrt{2}, -3/\sqrt{2}, 0})


Problem 7.20 


The complexity is determined by the number of coefficients in the scaling and wavelet
vectors, that is, by n in Eqs. (7.2-18) and (7.2-28). This defines the number of taps in
filters h_\psi(-n), h_\varphi(-n), h_\psi(n), and h_\varphi(n).
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Problem 7.21 


(a) Input \varphi(n) = {1, 1, 1, 1, 1, 1, 1, 1} = \varphi_{0,0}(n) for a three-scale wavelet transform
with Haar scaling and wavelet functions. Since wavelet transform coefficients measure
the similarity of the input to the basis functions, the resulting transform is

\[ \{ W_\varphi(0,0), W_\psi(0,0), W_\psi(1,0), W_\psi(1,1), W_\psi(2,0), W_\psi(2,1), W_\psi(2,2), W_\psi(2,3) \} = \{ 2\sqrt{2}, 0, 0, 0, 0, 0, 0, 0 \}. \]

The W_\varphi(0,0) term can be computed using Eq. (7.3-5) with j_0 = k = 0.

(b) Using the same reasoning as in part (a), the transform is {0, 2\sqrt{2}, 0, 0, 0, 0, 0, 0}.

(c) For the given transform, W_\psi(2,2) = B and all other transform coefficients are 0.
Thus, the input must be proportional to \psi_{2,2}(x). The input sequence must be of the form
{0, 0, 0, 0, C, -C, 0, 0} for some C. To determine C, use Eq. (7.3-6) to write

\[ \begin{aligned} W_\psi(2,2) &= \frac{1}{\sqrt{8}} \{ f(0)\psi_{2,2}(0) + f(1)\psi_{2,2}(1) + f(2)\psi_{2,2}(2) + f(3)\psi_{2,2}(3) \\ &\qquad + f(4)\psi_{2,2}(4) + f(5)\psi_{2,2}(5) + f(6)\psi_{2,2}(6) + f(7)\psi_{2,2}(7) \} \\ &= \frac{1}{\sqrt{8}} \{ (0)(0) + (0)(0) + (0)(0) + (0)(0) + (C)(2) + (-C)(-2) + (0)(0) + (0)(0) \} \\ &= \frac{1}{\sqrt{8}} \{ 2C + 2C \} = \frac{4C}{\sqrt{8}} = \sqrt{2}\, C. \end{aligned} \]

Because this coefficient is known to have the value B, we have that \sqrt{2}\, C = B or

\[ C = \frac{\sqrt{2}}{2} B. \]

Thus, the input sequence is {0, 0, 0, 0, \sqrt{2}B/2, -\sqrt{2}B/2, 0, 0}. To check the result
substitute these values into Eq. (7.3-6):

\[ \begin{aligned} W_\psi(2,2) &= \frac{1}{\sqrt{8}} \left\{ (0)(0) + (0)(0) + (0)(0) + (0)(0) + \left( \tfrac{\sqrt{2}}{2} B \right)(2) + \left( -\tfrac{\sqrt{2}}{2} B \right)(-2) + (0)(0) + (0)(0) \right\} \\ &= \frac{1}{\sqrt{8}} \{ \sqrt{2} B + \sqrt{2} B \} \\ &= B. \end{aligned} \]

Problem 7.22 


They are both multi-resolution representations that employ a single reduced-resolution
approximation image and a series of “difference” images. For the FWT, these
“difference” images are the transform detail coefficients; for the pyramid, they are the
prediction residuals.

To construct the approximation pyramid that corresponds to the transform in Fig. 7.8(a),
we will use the FWT^{-1} 2-D synthesis bank of Fig. 7.22(c). First, place the 64 x 64
approximation “coefficients” from Fig. 7.8(a) at the top of the pyramid being constructed.
Then use it, along with 64 x 64 horizontal, vertical, and diagonal detail coefficients
from the upper-left of Fig. 7.8(a), to drive the filter bank inputs in Fig. 7.22(c). The
output will be a 128 x 128 approximation of the original image and should be used as
the next level of the approximation pyramid. The 128 x 128 approximation is then used
with the three 128 x 128 detail coefficient images in the upper 1/4 of the transform in
Fig. 7.8(a) to drive the synthesis filter bank in Fig. 7.22(c) a second time, producing
a 256 x 256 approximation that is placed as the next level of the approximation
pyramid. This process is then repeated a third time to recover the 512 x 512 original image,
which is placed at the bottom of the approximation pyramid. Thus, the approximation
pyramid would have 4 levels.


Problem 7.23 


One pass through the FWT 2-D filter bank of Fig. 7.22(a) is all that is required (see Fig.
P7.23):


Figure P7.23 (filter bank with input W_\varphi(1, m, n); each row is filtered with {-1/\sqrt{2}, 1/\sqrt{2}} and {1/\sqrt{2}, 1/\sqrt{2}} and downsampled along the columns, then each column is filtered and downsampled along the rows; the three detail outputs are [0], [4], and [-3], and the approximation output is W_\varphi(0, 0, 0) = [5], ordered per Fig. 7.22(b))
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Problem 7.24 

As can be seen in the sequence of images that are shown, the DWT is not shift
invariant. If the input is shifted, the transform changes. Since all original images in the
problem are 128 x 128, they become the W_\varphi(7, m, n) inputs for the FWT computation
process. The filter bank of Fig. 7.22(a) can be used with j + 1 = 7. For a single
scale transform, transform coefficients W_\varphi(6, m, n) and W_\psi^i(6, m, n) for i = H, V, D
are generated. With Haar wavelets, the transformation process subdivides the image into
non-overlapping 2 x 2 blocks and computes 2-point averages and differences (per the
scaling and wavelet vectors). Thus, there are no horizontal, vertical, or diagonal detail
coefficients in the first two transforms shown; the input images are constant in all 2 x 2
blocks (so all differences are 0). If the original image is shifted by 1 pixel, detail
coefficients are generated since there are then 2 x 2 areas that are not constant. This is the
case in the third transform shown.

Problem 7.25 

The table is completed as shown in Fig. P7.25. 



Figure P7.25 

The functions are determined using Eqs. (7.2-18) and (7.2-28) with the Haar scaling and
wavelet vectors from Examples 7.5 and 7.6:

\[ \varphi(x) = \varphi(2x) + \varphi(2x - 1) \]
\[ \psi(x) = \varphi(2x) - \varphi(2x - 1). \]


Problem 7.26 


(a) The analysis tree is shown in Fig. P7.26(a).

(b) The corresponding frequency spectrum is shown in Fig. P7.26(b).


Figure P7.26


Problem 7.27 


First use the entropy measure to find the starting value for the input sequence, which is

\[ E\{f(n)\} = \sum_{n=0}^{7} f^2(n) \ln\left[ f^2(n) \right] = 2.7726. \]

Then perform an iteration of the FWT and compute the entropy of the generated
approximation and detail coefficients. They are 2.0794 and 0, respectively. Since their sum is
less than the starting entropy of 2.7726, we will use the decomposition.

Because the detail entropy is 0, no further decomposition of the detail is warranted.
Thus, we perform another FWT iteration on the approximation to see if it should be
decomposed again. This process is then repeated until no further decompositions are
called for. The resulting optimal tree is shown in Fig. P7.27:


Figure P7.27 (optimal decomposition tree; the root node is labeled 2.7726 and the two bottom leaves are labeled 0)





8 Problem Solutions 


Problem 8.1 

(a) A histogram equalized image (in theory) has a gray level distribution which is
uniform. That is, all gray levels are equally probable. Eq. (8.1-4) thus becomes

\[ L_{avg} = \frac{1}{2^n} \sum_{k=0}^{2^n - 1} l(r_k) \]

where 1/2^n is the probability of occurrence of any gray level. Since all levels are equally
probable, there is no advantage to assigning any particular gray level fewer bits than any
other. Thus, we assign each the fewest possible bits required to cover the 2^n levels.
This, of course, is n bits and L_{avg} becomes n bits also:

\[ \begin{aligned} L_{avg} &= \frac{1}{2^n} \sum_{k=0}^{2^n - 1} (n) \\ &= \frac{1}{2^n} (2^n)\, n \\ &= n. \end{aligned} \]

(b) Since interpixel redundancy is associated with the spatial arrangement of the gray
levels in the image, it is possible for a histogram equalized image to contain a high level
of interpixel redundancy, or none at all.

Problem 8.2 


(a) A single line of raw data contains n_1 = 2^n bits. The maximum run length would be
2^n and thus require n bits for representation. The starting coordinate of each run also
requires n bits since it may be arbitrarily located within the 2^n pixel line. Since a run
length of 0 cannot occur and the run-length pair (0, 0) is used to signal the start of each
new line, an additional 2n bits are required per line. Thus, the total number of bits
required to code any scan line is

\[ \begin{aligned} n_2 &= 2n + N_{avg}(n + n) \\ &= 2n\,(1 + N_{avg}) \end{aligned} \]

where N_{avg} is the average number of run-length pairs on a line. To achieve some level
of compression, C_R must be greater than 1. So,

\[ C_R = \frac{n_1}{n_2} = \frac{2^n}{2n\,(1 + N_{avg})} > 1 \]

and

\[ N_{avg} < \frac{2^{n-1}}{n} - 1. \]

(b) For n = 10, N_{avg} must be less than 50.2 run-length pairs per line.
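The bound in part (b) follows from requiring the ratio of raw bits (2^n) to coded bits (2n(1 + N_avg)) to exceed 1. A small sketch, with illustrative function names, evaluates both the ratio and the resulting limit on the average number of run-length pairs:

```python
def compression_ratio(n, n_avg):
    """Raw bits per line (2**n) divided by run-length coded bits (2n(1 + N_avg))."""
    return 2 ** n / (2 * n * (1 + n_avg))


def max_average_runs(n):
    """Largest N_avg for which the compression ratio still exceeds 1."""
    return 2 ** (n - 1) / n - 1


print(max_average_runs(10))
print(compression_ratio(10, 50), compression_ratio(10, 51))
```

For n = 10, an average of 50 run-length pairs per line still compresses slightly, while 51 pairs expands the data.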


Problem 8.3 


Table P8.3 shows the data, its 6-bit code, the IGS sum for each step, the actual IGS 3-bit
code and its equivalent decoded value, the error between the decoded IGS value and the
input values, and the squared error.


Table P8.3

Data   6-bit Code   Sum      IGS Code   Decoded IGS   Error   Sq. Error
                    000000
12     001100       001100   001        8             4       16
12     001100       010000   010        16            -4      16
13     001101       001101   001        8             5       25
13     001101       010010   010        16            -3      9
10     001010       001100   001        8             2       4
13     001101       010001   010        16            -3      9
57     111001       111001   111        56            1       1
54     110110       110111   110        48            6       36
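The rows of Table P8.3 can be regenerated with a short IGS quantizer. This is a sketch under the usual IGS rule: the low-order bits of the previous sum are added to the current pixel unless the pixel's high-order bits are all 1s, in which case nothing is added. The function name `igs_quantize` and its parameters are illustrative.

```python
def igs_quantize(pixels, bits=6, keep=3):
    """Return (pixel, sum, IGS code, decoded value, error) for each input pixel."""
    low = bits - keep
    low_mask = (1 << low) - 1
    high_mask = ((1 << keep) - 1) << low
    previous_sum = 0
    rows = []
    for pixel in pixels:
        # Skip the carry-in when the high-order bits are all 1s (avoids overflow).
        carry = 0 if (pixel & high_mask) == high_mask else previous_sum & low_mask
        total = pixel + carry
        code = total >> low    # the `keep` most significant bits of the sum
        decoded = code << low  # decoded value with zeroed low-order bits
        rows.append((pixel, total, code, decoded, pixel - decoded))
        previous_sum = total
    return rows


rows = igs_quantize([12, 12, 13, 13, 10, 13, 57, 54])
for pixel, total, code, decoded, error in rows:
    print(pixel, format(total, "06b"), format(code, "03b"), decoded, error, error ** 2)
```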


Problem 8.4 


The average square error is the sum of the last column of the table in Problem 8.3 divided
by 8, the number of data points. This computation yields 116/8 or 14.5. The rms error
is then 3.81, the square root of 14.5. The squared signal value (i.e., 6400) is obtained by
summing the squares of column 5 of the table. The rms signal-to-noise ratio is then

\[ SNR_{rms} = \sqrt{\frac{6400}{116}} = 7.43. \]
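These figures can be recomputed directly from the data and decoded IGS values of Problem 8.3. The sketch below copies those two columns as plain lists; the variable names are illustrative.

```python
import math

data = [12, 12, 13, 13, 10, 13, 57, 54]   # input values
decoded = [8, 16, 8, 16, 8, 16, 56, 48]   # decoded IGS values (column 5)

squared_error = sum((d - q) ** 2 for d, q in zip(data, decoded))
mean_square_error = squared_error / len(data)
rms_error = math.sqrt(mean_square_error)

squared_signal = sum(q ** 2 for q in decoded)
snr_rms = math.sqrt(squared_signal / squared_error)

print(squared_error, mean_square_error, round(rms_error, 2))
print(squared_signal, round(snr_rms, 2))
```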


Problem 8.5 


(a) For the first value of the table (i.e., 0110), substitution into Eq. (8.2-1) gives:

\[ \begin{aligned} h_1 &= b_3 \oplus b_2 \oplus b_0 = 0 \oplus 1 \oplus 0 = 1 \\ h_2 &= b_3 \oplus b_1 \oplus b_0 = 0 \oplus 1 \oplus 0 = 1 \\ h_3 &= b_3 = 0 \\ h_4 &= b_2 \oplus b_1 \oplus b_0 = 1 \oplus 1 \oplus 0 = 0 \\ h_5 &= b_2 = 1 \\ h_6 &= b_1 = 1 \\ h_7 &= b_0 = 0. \end{aligned} \]

Thus, the encoded value is 1100110. The remaining values of Table 8.2 are treated
similarly. The resulting code words are 0011001, 1110000, and 1111111, respectively.

(b) For 1100111, construct the following three bit odd parity word:

\[ \begin{aligned} c_1 &= h_1 \oplus h_3 \oplus h_5 \oplus h_7 = 1 \oplus 0 \oplus 1 \oplus 1 = 1 \\ c_2 &= h_2 \oplus h_3 \oplus h_6 \oplus h_7 = 1 \oplus 0 \oplus 1 \oplus 1 = 1 \\ c_4 &= h_4 \oplus h_5 \oplus h_6 \oplus h_7 = 0 \oplus 1 \oplus 1 \oplus 1 = 1. \end{aligned} \]

A parity word of 111_2 indicates that bit 7 is in error. The correctly decoded binary value
is 0110_2. In a similar manner, the parity words for 1100110 and 1100010 are 000 and
101, respectively. The decoded values are identical and are 0110.
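The encoding and the parity-word check can be exercised in code. This is a minimal sketch of the (7,4) Hamming scheme as used in the solution, with bits held in lists ordered h1 through h7; the function names are illustrative.

```python
def hamming_encode(b3, b2, b1, b0):
    """Return the 7-bit code word [h1, ..., h7] for the 4-bit value b3 b2 b1 b0."""
    h1 = b3 ^ b2 ^ b0
    h2 = b3 ^ b1 ^ b0
    h4 = b2 ^ b1 ^ b0
    return [h1, h2, b3, h4, b2, b1, b0]


def hamming_decode(h):
    """Return (position of the bit in error or 0, decoded bits [b3, b2, b1, b0])."""
    h1, h2, h3, h4, h5, h6, h7 = h
    c1 = h1 ^ h3 ^ h5 ^ h7
    c2 = h2 ^ h3 ^ h6 ^ h7
    c4 = h4 ^ h5 ^ h6 ^ h7
    position = 4 * c4 + 2 * c2 + c1  # parity word c4 c2 c1 read as a binary number
    fixed = list(h)
    if position:
        fixed[position - 1] ^= 1     # flip the single bit in error
    return position, [fixed[2], fixed[4], fixed[5], fixed[6]]


print(hamming_encode(0, 1, 1, 0))
for word in ([1, 1, 0, 0, 1, 1, 1], [1, 1, 0, 0, 1, 1, 0], [1, 1, 0, 0, 0, 1, 0]):
    print(word, hamming_decode(word))
```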


Problem 8.6 


The conversion factors are computed using the logarithmic relationship

\[ \log_a x = \frac{\log_b x}{\log_b a}. \]

Thus, 1 Hartley = 3.3219 bits and 1 nat = 1.4427 bits.
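Both factors are base-2 logarithms of the other unit's base, which a two-line check confirms (variable names are illustrative):

```python
import math

bits_per_hartley = math.log2(10)   # a Hartley uses base-10 logarithms
bits_per_nat = math.log2(math.e)   # a nat uses natural logarithms

print(round(bits_per_hartley, 4), round(bits_per_nat, 4))
```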


Problem 8.7 


Let the set of source symbols be {a_1, a_2, ..., a_q} with probabilities

\[ \mathbf{z} = [P(a_1), P(a_2), \ldots, P(a_q)]^T. \]

Then, using Eq. (8.3-3) and the fact that the sum of all P(a_i) is 1, we get

\[ \begin{aligned} \log q - H(\mathbf{z}) &= \sum_{i=1}^{q} P(a_i) \log q + \sum_{i=1}^{q} P(a_i) \log P(a_i) \\ &= \sum_{i=1}^{q} P(a_i) \log qP(a_i). \end{aligned} \]

Using the log relationship from Problem 8.6, this becomes

\[ = \log e \sum_{i=1}^{q} P(a_i) \ln qP(a_i). \]

Then, multiplying the inequality \ln x \le x - 1 by -1 to get \ln 1/x \ge 1 - x and applying
it to this last result,

\[ \begin{aligned} \log q - H(\mathbf{z}) &\ge \log e \sum_{i=1}^{q} P(a_i) \left[ 1 - \frac{1}{qP(a_i)} \right] \\ &\ge \log e \left[ \sum_{i=1}^{q} P(a_i) - \frac{1}{q} \sum_{i=1}^{q} \frac{P(a_i)}{P(a_i)} \right] \\ &\ge \log e\, [1 - 1] \\ &\ge 0 \end{aligned} \]

so that

\[ \log q \ge H(\mathbf{z}). \]

Therefore, H(z) is always less than, or equal to, log q. Furthermore, in view of the
equality condition (x = 1) for \ln 1/x \ge 1 - x, which was introduced at only one point
in the above derivation, we will have strict equality if and only if P(a_i) = 1/q for all i.

Problem 8.8


The source symbol probabilities are taken directly from z and are P(a = 0) = 0.75 and
P(a = 1) = 0.25. Likewise, the elements of Q are the forward transition probabilities
P(b = 0|a = 0) = 2/3, P(b = 0|a = 1) = 1/10, P(b = 1|a = 0) = 1/3, and
P(b = 1|a = 1) = 9/10. The matrix multiplication of Eq. (8.3-6) yields the output
probabilities

\[ \mathbf{v} = \mathbf{Q}\mathbf{z} = \begin{bmatrix} \frac{2}{3} & \frac{1}{10} \\ \frac{1}{3} & \frac{9}{10} \end{bmatrix} \begin{bmatrix} \frac{3}{4} \\ \frac{1}{4} \end{bmatrix} = \begin{bmatrix} \frac{21}{40} \\ \frac{19}{40} \end{bmatrix}. \]

Thus, P(b = 0) = 21/40 and P(b = 1) = 19/40. The conditional input probabilities
are computed using Bayes’ formula

\[ P(a_j | b_k) = \frac{P(b_k | a_j)\, P(a_j)}{P(b_k)}. \]

Thus, P(a = 0|b = 0) = 20/21, P(a = 0|b = 1) = 10/19, P(a = 1|b = 0) = 1/21,
and P(a = 1|b = 1) = 9/19. Finally, the joint probabilities are computed using

\[ P(a_j, b_k) = P(a_j)\, P(b_k | a_j) \]

which yields P(a = 0, b = 0) = 1/2, P(a = 0, b = 1) = 1/4, P(a = 1, b = 0) =
1/40, and P(a = 1, b = 1) = 9/40.
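All of these probabilities can be reproduced exactly with rational arithmetic. In the sketch below the indexing convention (`Q[k][j]` for P(b = k | a = j), `joint[j][k]` and `posterior[j][k]` for input j and output k) is an illustrative choice.

```python
from fractions import Fraction as Fr

p_a = [Fr(3, 4), Fr(1, 4)]            # source probabilities P(a = 0), P(a = 1)
Q = [[Fr(2, 3), Fr(1, 10)],           # Q[k][j] = P(b = k | a = j)
     [Fr(1, 3), Fr(9, 10)]]

# Output probabilities v = Qz.
p_b = [sum(Q[k][j] * p_a[j] for j in range(2)) for k in range(2)]

# Joint probabilities P(a = j, b = k) = P(a = j) P(b = k | a = j).
joint = [[p_a[j] * Q[k][j] for k in range(2)] for j in range(2)]

# Conditional input probabilities P(a = j | b = k) from Bayes' formula.
posterior = [[joint[j][k] / p_b[k] for k in range(2)] for j in range(2)]

print(p_b)
print(joint)
print(posterior)
```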


Problem 8.9 

(a) Substituting the given values of p_{bs} and p_e into the binary entropy function derived
in the example, the average information or entropy of the source is 0.811 bits/symbol.

(b) The equivocation, or average entropy of the source given that the output has been
observed (using Eq. 8.3-9), is 0.75 bits/symbol. Thus, the decrease in uncertainty is
0.061 bits/symbol.

(c) It is the mutual information I(z, v) of the system and is less than the capacity of
the channel, which is, in accordance with the equation derived in the example, 0.0817
bits/symbol.


Problem 8.10 

(a) The proof proceeds by substituting the elements of Q into Eq. (8.3-13) and simpli- 
fying. The source probabilities are left as variables during the simplification. 

C = max z [/[z,v]] 

= max, ZU Ef=i P (aj ) q kj log E/=i ^ a . )gfei 
= max, [ELi P M Qk i log rUi 9 p (aj)qki 

+ £2=1 P (o2) q k2 log rL J 2 {aj)qhl _ 

= max, [P (oi) ((1 - p) log P(a \^ 0) + /31og P{a ,)(i-p) + °) 

+ P (o 2 ) ^0 + (3 log P{a J (1 _ 0) + (1 - 0) log 
= max, P(ai) ((1-/3) log p^j + ^log^p^y) 

+ P (as) (/31og + (1 - P) log p^y) 

= max, [-P(ai) ((1 - /3)logP(ai) + /31og2P(ai)) 

- P (a 2 ) (/? log 2P (a 2 ) + (1 - /?) log P (a 2 ))] 
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= max, [-P (ai) ((1 - 0) log P (ai) + //log 2 + f3 log P (ai)) 

- P(a 2 ) (/? log 2 + /3 log P(a 2 ) + (1 - /?) log P(a 2 ))] 

= max, [-P(ai) (logP(ai) + /31og2) - P(a 2 ) (/31og 2 + logP (a 2 ))] 

= max, [— P (ai) logP (ai) - P(a 2 ) log P (a 2 ) - P(ai)/3 log 2) 

- P(a 2 )/3 log 2], 

Noting that the first two terms of this sum are the entropy of the source and factoring out
the common factor in the last two terms, we get

\[ C = \max_{\mathbf{z}} \left[ H(\mathbf{z}) - (P(a_1) + P(a_2))\, \beta \log 2 \right]. \]

Since the sum of the source probabilities is 1 and the maximum entropy of a binary
source is also 1 with both symbols equally probable, this reduces to

\[ C = 1 - \beta. \]

(b) Substituting 0.5 into the above equation, the capacity of the erasure channel is 0.5.
Substituting 0.125 into the equation for the capacity of a BSC given in Section 8.3.2, we
find that its capacity is 0.456. Thus, the binary erasure channel with a higher probability
of error has a larger capacity to transfer information.
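The two capacities in part (b) can be compared numerically. The sketch assumes the standard binary symmetric channel capacity, 1 minus the binary entropy of the error probability, for the BSC equation referenced in Section 8.3.2, and uses the 1 - β result derived above for the erasure channel; the function names are illustrative.

```python
import math


def binary_entropy(p):
    """Binary entropy function in bits."""
    if p in (0, 1):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def erasure_capacity(beta):
    """Capacity of the binary erasure channel, C = 1 - beta."""
    return 1 - beta


def bsc_capacity(p_e):
    """Capacity of a binary symmetric channel (assumed form: 1 - H_bs(p_e))."""
    return 1 - binary_entropy(p_e)


print(erasure_capacity(0.5), round(bsc_capacity(0.125), 3))
```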


Problem 8.11 


(a) The plot is shown in Fig. P8.11.

(b) D_{max} is \sigma^2.

(c) If we wish to code the source in this example so that the maximum average
encoding-decoding distortion D is 0.75\sigma^2, we first evaluate R(D) for D = 0.75\sigma^2. Since
R(0.75\sigma^2) = 0.21, we know that at least 0.21 code bits per source symbol must be
used to achieve the fidelity objective. Thus, this is the maximum possible information
compression under this criterion.


Problem 8.12 


(a) There are two unique codes.

(b) The codes are: (1) 0, 11, 10 and (2) 1, 00, 01. The codes are complements of one
another. They are constructed by following the Huffman procedure for three symbols of
arbitrary probability.

Figure P8.11 (horizontal axis: distortion D/\sigma^2)


Problem 8.13 


(a) The entropy is computed using Eq. (8.3-3) and is 2.6508 bits/symbol.

(b) The specific binary codes assigned to each gray level may vary depending upon
the arbitrary selection of 1s and 0s assigned at each step of the coding algorithm. The
number of bits used for each gray level, however, should be the same for all versions
constructed. The construction of Code 2 in Table 8.1 proceeds as follows:

Step 1: Arrange according to symbol probabilities from left to right, as shown in Fig.
P8.13(a).

Step 2: Assign code words based on the ordered probabilities from right to left, as shown
in Fig. P8.13(b).

Step 3: The codes associated with each gray level are read at the left of the diagram.

(c) - (f) The remaining codes and their average lengths, which are computed using Eq.
(8.1-4), are shown in Table P8.13. Note that two Huffman shift codes are listed, one of
which is the best. In generating these codes, the sum of probabilities 4-7 was used
as the probability of the shift up symbol. The sum is 0.19, which is equivalent to the
probability of symbol r_0. Thus, the two codes shown differ by the ordering of r_0 and
the shift symbol during the Huffman coding process.




Table P8.13

r_k          p_r(r_k)   B1-code   2-bit Shift   H. Shift 1   H. Shift 2   Huffman
r_0 = 0      0.19       C0C0      10            11           000          11
r_1 = 1/7    0.25       C0        00            01           01           01
r_2 = 2/7    0.21       C1        01            10           10           10
r_3 = 3/7    0.16       C0C1      1100          001          001          001
r_4 = 4/7    0.08       C1C0      1101          00001        1101         0001
r_5 = 5/7    0.06       C1C1      1110          00010        1110         00001
r_6 = 6/7    0.03       C0C0C0    111100        00011        11000        000001
r_7 = 1      0.02       C0C0C1    111101        000001       11001        000000
Length                  3.18      2.8           2.75         2.78         2.7


The entropy of the source is $H = 2.65$ bits/symbol from Eq. (8.3-3) and the probabilities from
column 2.
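
The entropy and the average lengths in the last row of Table P8.13 can be reproduced with a few lines of code. The sketch below transcribes the probabilities and code words from the table; the names `PROBS`, `CODES`, `entropy`, and `average_length` are illustrative only, and each continuation symbol C of the B1-code is counted as one bit.

```python
import math

# Probabilities and code words transcribed from Table P8.13 (rows r_0 to r_7).
PROBS = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]
CODES = {
    # "C" stands for the continuation bit and is counted as one bit.
    "B1": ["C0C0", "C0", "C1", "C0C1", "C1C0", "C1C1", "C0C0C0", "C0C0C1"],
    "2-bit shift": ["10", "00", "01", "1100", "1101", "1110", "111100", "111101"],
    "H. shift 1": ["11", "01", "10", "001", "00001", "00010", "00011", "000001"],
    "H. shift 2": ["000", "01", "10", "001", "1101", "1110", "11000", "11001"],
    "Huffman": ["11", "01", "10", "001", "0001", "00001", "000001", "000000"],
}


def entropy(probs):
    """First-order entropy in bits/symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def average_length(probs, words):
    """Average code word length in bits/symbol."""
    return sum(p * len(w) for p, w in zip(probs, words))


source_entropy = entropy(PROBS)
lengths = {name: average_length(PROBS, words) for name, words in CODES.items()}
```

Every average length is at least as large as the entropy, and the Huffman code comes closest to it.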


Problem 8.14 


The arithmetic decoding process is the reverse of the encoding procedure. Start by 
dividing the [0, 1) interval according to the symbol probabilities. This is shown in Table
P8.14. The decoder immediately knows the message 0.23355 begins with an “e”, since 
the coded message lies in the interval [0.2, 0.5). This makes it clear that the second 
symbol is an “a”, which narrows the interval to [0.2, 0.26). To further see this, divide 
the interval [0.2, 0.5) according to the symbol probabilities. Proceeding like this, which 
is the same procedure used to code the message, we get “eaii!”. 


Table P8.14

Symbol   Probability   Range
a        0.2           [0.0, 0.2)
e        0.3           [0.2, 0.5)
i        0.1           [0.5, 0.6)
o        0.2           [0.6, 0.8)
u        0.1           [0.8, 0.9)
!        0.1           [0.9, 1.0)
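
The decoding procedure described above can be sketched in code. The example uses the symbol ranges of Table P8.14 and exact rational arithmetic to avoid rounding problems. It assumes that "!" acts as the end-of-message symbol; the names `MODEL`, `build_ranges`, and `arithmetic_decode` are illustrative and not taken from the text.

```python
from fractions import Fraction

# Symbol probabilities from Table P8.14, in table order.
MODEL = [("a", "0.2"), ("e", "0.3"), ("i", "0.1"),
         ("o", "0.2"), ("u", "0.1"), ("!", "0.1")]


def build_ranges(model):
    """Turn the probabilities into cumulative [low, high) ranges."""
    ranges, low = [], Fraction(0)
    for symbol, prob in model:
        high = low + Fraction(prob)
        ranges.append((symbol, low, high))
        low = high
    return ranges


def arithmetic_decode(value, model, end_symbol="!", max_symbols=100):
    """Decode until the (assumed) end-of-message symbol is found."""
    ranges = build_ranges(model)
    value = Fraction(value)
    message = []
    for _ in range(max_symbols):
        for symbol, low, high in ranges:
            if low <= value < high:
                break
        else:
            raise ValueError("coded value lies outside [0, 1)")
        message.append(symbol)
        if symbol == end_symbol:
            return "".join(message)
        # Rescale so the selected subinterval becomes [0, 1) again.
        value = (value - low) / (high - low)
    raise ValueError("end-of-message symbol not found")


decoded = arithmetic_decode("0.23355", MODEL)
```

Rescaling the value after each symbol is equivalent to subdividing the current interval, which is how the solution describes the process.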
Figure P8.13


Problem 8.15


Assume that the first 256 codes in the starting dictionary are the ASCII codes. If you 
assume 7-bit ASCII, the first 128 locations are all that are needed. In either case, the 
ASCII "a" corresponds to location 97. The coding proceeds as shown in Table P8.15.


Table P8.15

Recognized   Character   Output   Dict. Address   Dict. Entry
             a
a            a           97       256             aa
a            a
aa           a           256      257             aaa
a            a
aa           a
aaa          a           257      258             aaaa
a            a
aa           a
aaa          a
aaaa         a           258      259             aaaaa
a                        97
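
The coding steps of Table P8.15 can be reproduced with a short LZW encoder. This is a minimal sketch, assuming a starting dictionary of the 256 single-character codes and an input of eleven "a" characters, which is the length implied by the table rows (1 + 2 + 3 + 4 + 1 characters); the function name `lzw_encode` is illustrative.

```python
def lzw_encode(text):
    """LZW-encode a string using a 256-entry starting dictionary."""
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    recognized = ""
    output = []
    for ch in text:
        candidate = recognized + ch
        if candidate in dictionary:
            # Keep growing the currently recognized sequence.
            recognized = candidate
        else:
            # Emit the code of the recognized sequence, add the new entry.
            output.append(dictionary[recognized])
            dictionary[candidate] = next_code
            next_code += 1
            recognized = ch
    if recognized:
        output.append(dictionary[recognized])
    return output


codes = lzw_encode("a" * 11)
```

The resulting list matches the Output column of the table when read from top to bottom.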




Problem 8.16 


The input to the LZW decoding algorithm for the example in Example 8.12 is

39 39 126 126 256 258 260 259 257 126

The starting dictionary, to be consistent with the coding itself, contains 512 locations,
with the first 256 corresponding to gray level values 0 through 255. The decoding algorithm
begins by getting the first encoded value, outputting the corresponding value from
the dictionary, and setting the "recognized sequence" to the first value. For each additional
encoded value, we (1) output the dictionary entry for the pixel value(s), (2) add a
new dictionary entry whose content is the "recognized sequence" plus the first element
of the encoded value being processed, and (3) set the "recognized sequence" to the encoded
value being processed. For the encoded output in Example 8.12, the sequence of
operations is as shown in Table P8.16.

Note, for example, in row 5 of the table that the new dictionary entry for location 259
is 126-39, the concatenation of the currently recognized sequence, 126, and the first
element of the encoded value being processed (the 39 from the 39-39 entry in dictionary
location 256). The output is then read from the third column of the table to yield

39   39   126   126
39   39   126   126
39   39   126   126
39   39   126   126

where it is assumed that the decoder knows or is given the size of the image that was
received. Note that the dictionary is generated as the decoding is carried out.


Table P8.16

Recognized   Encoded Value   Pixels       Dict. Address   Dict. Entry
             39              39
39           39              39           256             39-39
39           126             126          257             39-126
126          126             126          258             126-126
126          256             39-39        259             126-39
256          258             126-126      260             39-39-126
258          260             39-39-126    261             126-126-39
260          259             126-39       262             39-39-126-126
259          257             39-126       263             126-39-39
257          126             126          264             39-126-126
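
The three decoding steps listed above translate directly into code. The sketch below follows them and also handles the standard special case in which an encoded value refers to the dictionary entry that is about to be created; that case does not occur in this example and is included only as a safeguard. The function name `lzw_decode` is illustrative.

```python
def lzw_decode(codes, initial_size=256):
    """Decode a list of LZW codes into a list of pixel values."""
    if not codes:
        return []
    dictionary = {i: [i] for i in range(initial_size)}
    next_code = initial_size
    recognized = dictionary[codes[0]]
    output = list(recognized)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Special case (not needed here): entry not yet in the dictionary.
            entry = recognized + [recognized[0]]
        else:
            raise ValueError("invalid LZW code: %d" % code)
        output.extend(entry)                                # step (1)
        dictionary[next_code] = recognized + [entry[0]]     # step (2)
        next_code += 1
        recognized = entry                                  # step (3)
    return output


pixels = lzw_decode([39, 39, 126, 126, 256, 258, 260, 259, 257, 126])
```

The decoded list contains 16 values, which corresponds to the 4 x 4 image of Example 8.12 read row by row.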


Problem 8.17 


(a) Using Eq. (8.4-3), form Table P8.17. 


Table P8.17

Binary   Gray Code      Binary   Gray Code
0000     0000           1000     1100
0001     0001           1001     1101
0010     0011           1010     1111
0011     0010           1011     1110
0100     0110           1100     1010
0101     0111           1101     1011
0110     0101           1110     1001
0111     0100           1111     1000
(b) The procedure is to work from the most significant bit to the least significant bit
using the equations:

$a_{m-1} = g_{m-1}$

$a_i = g_i \oplus a_{i+1}, \quad 0 \le i \le m-2.$

The decoded binary value is thus 0101100111010.
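
Both directions of the conversion can be sketched as follows. Bit strings are written most significant bit first, and the function names are illustrative. The Gray-coded input of part (b) is not restated in this solution, so the example recovers it by re-encoding the decoded answer and then decodes it again.

```python
def binary_to_gray(bits):
    """Gray-code a binary string: g_{m-1} = a_{m-1}, g_i = a_i XOR a_{i+1}."""
    gray = [bits[0]]
    for prev, cur in zip(bits, bits[1:]):
        gray.append(str(int(prev) ^ int(cur)))
    return "".join(gray)


def gray_to_binary(gray):
    """Invert the Gray code, working from the MSB toward the LSB."""
    binary = [gray[0]]
    for g in gray[1:]:
        binary.append(str(int(g) ^ int(binary[-1])))
    return "".join(binary)


answer = "0101100111010"
gray_input = binary_to_gray(answer)      # the Gray code that decodes to the answer
round_trip = gray_to_binary(gray_input)
```

The same two functions reproduce every row of Table P8.17, for example binary 1010 maps to Gray code 1111.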


Problem 8.18 

(a) Using the procedure described in Section 8.4.3, the decoded line is

[W 1001 W W W W W W 0000 0010 W W W W W W]

where W denotes four white pixels, i.e., 1111.

(b) - (c) Establish the convention that sub-blocks are included in the code string from
left to right. Then, using brackets to clarify the decomposition steps, we get

1 [ [W 1001 W W W W W W] [0000 0010 W W W W W W] ]

1 [ 1 [ [W 1001 W W] [W W W W] ] 1 [ [0000 0010 W W] [W W W W] ] ]

1 [ 1 [ 1 [ [W 1001] [W W] ] [0] ] 1 [ 1 [ [0000 0010] [W W] ] [0] ] ]

1 [ 1 [ 1 [ 1 [ [W] [1001] ] [0] ] [0] ] 1 [ 1 [ 1 [ [0000] [0010] ] [0] ] [0] ] ]

1 [ 1 [ 1 [ 1 [ [0] [11001] ] [0] ] [0] ] 1 [ 1 [ 1 [ [10000] [10010] ] [0] ] [0] ] ]

Thus, the encoded string is 111101100100111100001001000, which requires 27 bits.
The first encoding required 28 bits.
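
The two bit counts can be checked with a short script. The coding rules below are inferred from the decomposition shown above and are assumptions of this sketch: an all-white block is coded as 0, any other block is prefixed by 1 and split into two halves, and a non-white 4-pixel block is coded as 1 followed by its four pixels. The function names are illustrative.

```python
WHITE = "1111"

# The decoded line of part (a) as sixteen 4-pixel blocks.
LINE = [WHITE, "1001"] + [WHITE] * 6 + ["0000", "0010"] + [WHITE] * 6


def flat_wbs(blocks):
    """White block skipping on individual 4-pixel blocks."""
    return "".join("0" if b == WHITE else "1" + b for b in blocks)


def hierarchical_wbs(blocks):
    """White block skipping with recursive halving of non-white blocks."""
    if all(b == WHITE for b in blocks):
        return "0"
    if len(blocks) == 1:
        return "1" + blocks[0]
    half = len(blocks) // 2
    return "1" + hierarchical_wbs(blocks[:half]) + hierarchical_wbs(blocks[half:])


flat_code = flat_wbs(LINE)
tree_code = hierarchical_wbs(LINE)
```

For this line the hierarchical code saves only one bit, because the non-white blocks fall in both halves of the line.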

Problem 8.19 

(a) The motivation is clear from Fig. 8.17. The transition at c must somehow be tied 
to a particular transition on the previous line. Note that there is a closer white to black 
transition on the previous line to the right of c, but how would the decoder know to use
it instead of the one to the left? Both are less than ec. The first similar transition past e
establishes the convention to make this decision. 

(b) An alternate solution would be to include a special code which skips transitions on 
the previous line until you get to the closest one. 

Problem 8.20 


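(a) Substituting $\rho_h = 0$ into Eq. (8.5-12) and evaluating it to form the elements of R
and r, we get

$R = \sigma^2 \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}$ and $r = \sigma^2 \begin{bmatrix} \rho \\ \rho^2 \end{bmatrix}.$

(b) First form the inverse of R,

$R^{-1} = \frac{1}{\sigma^2 (1 - \rho^2)} \begin{bmatrix} 1 & -\rho \\ -\rho & 1 \end{bmatrix}.$

Then, perform the matrix multiplication of Eq. (8.5-8):

$\alpha = R^{-1} r = \frac{\sigma^2}{\sigma^2 (1 - \rho^2)} \begin{bmatrix} \rho (1 - \rho^2) \\ 0 \end{bmatrix} = \begin{bmatrix} \rho \\ 0 \end{bmatrix}.$

Thus, $\alpha_1 = \rho$ and $\alpha_2 = 0$.

(c) The variance is computed using Eq. (8.5-11):

$\sigma_e^2 = \sigma^2 - \alpha^T r = \sigma^2 - \begin{bmatrix} \rho & 0 \end{bmatrix} \sigma^2 \begin{bmatrix} \rho \\ \rho^2 \end{bmatrix} = \sigma^2 (1 - \rho^2).$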
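
The result can be confirmed with exact arithmetic for any particular correlation coefficient and variance. The sketch below solves the 2 x 2 system by hand using the R and r of part (a); the function name and the test values $\rho = 1/2$, $\sigma^2 = 3$ are arbitrary choices made for illustration.

```python
from fractions import Fraction


def second_order_predictor(rho, variance):
    """Return (alpha_1, alpha_2, prediction error variance)."""
    # R = variance * [[1, rho], [rho, 1]] and r = variance * [rho, rho**2].
    r1, r2 = variance * rho, variance * rho ** 2
    det = variance ** 2 * (1 - rho ** 2)
    # Multiply r by the inverse of the symmetric 2 x 2 matrix R.
    alpha_1 = (variance * r1 - variance * rho * r2) / det
    alpha_2 = (-variance * rho * r1 + variance * r2) / det
    error_variance = variance - (alpha_1 * r1 + alpha_2 * r2)
    return alpha_1, alpha_2, error_variance


alpha_1, alpha_2, error_variance = second_order_predictor(Fraction(1, 2), Fraction(3))
```

With these values the second coefficient vanishes and the error variance equals $\sigma^2 (1 - \rho^2) = 9/4$, as the closed-form result predicts.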


Problem 8.21 


The derivation proceeds by substituting the uniform probability function into Eqs. (8.5-20)
through (8.5-22) and solving the resulting simultaneous equations with L = 4. Eq. (8.5-21)
yields

$s_0 = 0$

$s_1 = \frac{1}{2}(t_1 + t_2)$

$s_2 = \infty.$

Substituting these values into the integrals defined by Eq. (8.5-20), we get two equations.
The first is (assuming $s_1 \le A$)

$\int_{s_0}^{s_1} (s - t_1)\, p(s)\, ds = 0$

$\frac{1}{2A} \int_0^{\frac{1}{2}(t_1 + t_2)} (s - t_1)\, ds = \frac{1}{2A} \left[ \frac{s^2}{2} - t_1 s \right]_0^{\frac{1}{2}(t_1 + t_2)} = 0$

$(t_1 + t_2)^2 - 4 t_1 (t_1 + t_2) = 0$

$(t_1 + t_2)(t_2 - 3 t_1) = 0$

$t_1 = -t_2$ and $t_2 = 3 t_1.$

The first of these relations does not make sense since both $t_1$ and $t_2$ must be positive.
The second relationship is a valid one. The second integral yields (noting that $s_1$ is less
than A so the integral from A to $\infty$ is 0 by the definition of p(s))

$\frac{1}{2A} \int_{\frac{1}{2}(t_1 + t_2)}^{A} (s - t_2)\, ds = \frac{1}{2A} \left[ \frac{s^2}{2} - t_2 s \right]_{\frac{1}{2}(t_1 + t_2)}^{A} = 0$

$4A^2 - 8 A t_2 - (t_1 + t_2)^2 + 4 t_2 (t_1 + t_2) = 0.$

Substituting $t_2 = 3 t_1$ from the first integral simplification into this result, we get

$8 t_1^2 - 6 A t_1 + A^2 = 0$

$\left[ t_1 - \frac{A}{2} \right] (8 t_1 - 2A) = 0$

$t_1 = \frac{A}{2}$ and $t_1 = \frac{A}{4}.$

Back substituting these values of $t_1$, we find the corresponding $t_2$ and $s_1$ values:

$t_2 = \frac{3A}{2}$ and $s_1 = A$ for $t_1 = \frac{A}{2}$

$t_2 = \frac{3A}{4}$ and $s_1 = \frac{A}{2}$ for $t_1 = \frac{A}{4}.$

Since $s_1 = A$ is not a real solution (the second integral equation would then be evaluated
from A to A, yielding 0 or no equation), the solution is given by the second. That is,

$s_0 = 0 \quad s_1 = \frac{A}{2} \quad s_2 = \infty$

$t_1 = \frac{A}{4} \quad t_2 = \frac{3A}{4}.$
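
The final levels can be checked against the two integral conditions used in the derivation. The sketch below evaluates $\int (s - t)\,ds$ over each decision interval with exact fractions; the constant factor $1/(2A)$ is dropped because it does not affect where the integral is zero. The function names and the test amplitude A = 2 are illustrative assumptions.

```python
from fractions import Fraction


def centroid_residual(lower, upper, level):
    """Integral of (s - level) ds over [lower, upper]."""
    return (upper ** 2 - lower ** 2) / 2 - level * (upper - lower)


def uniform_lloyd_max_4(amplitude):
    """Positive-side levels for L = 4 and a uniform density on [-A, A]."""
    t1 = amplitude / 4
    t2 = 3 * amplitude / 4
    s1 = (t1 + t2) / 2
    return s1, t1, t2


A = Fraction(2)
s1, t1, t2 = uniform_lloyd_max_4(A)
first_condition = centroid_residual(0, s1, t1)    # interval [s_0, s_1]
second_condition = centroid_residual(s1, A, t2)   # interval [s_1, A]
```

Both residuals are zero, so each reconstruction level sits at the centroid of its decision interval, which for a uniform density is simply the midpoint.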



Problem 8.22 


Following the procedure in the flow chart of Fig. 8.37, the proper code is 

0001 010 1 00110000110001 

where the spaces have been inserted for readability alone. The coding mode sequence is 
pass, vertical (1 left), vertical (directly below), horizontal (distances 3 and 4), and pass.


Problem 8.23 


(a) - (b) Following the procedure outlined in Section 8.6.2, we obtain the results shown 
in Table P8.23. 


Problem 8.24 


Since the T1 transfer rate is 1.544 Mbit/sec, a 6 second transfer will provide

$(1.544 \times 10^6)(6 \text{ sec}) = 9.264 \times 10^6$ bits

of data. The initial approximation of the X-ray must contain no more than this number
of bits. The required compression ratio is thus

$C_R = \frac{4096 \times 4096 \times 12}{9.264 \times 10^6} = 21.73.$

The JPEG transform coding approach of Section 8.6 can achieve this level of compression
and provide reasonably good reconstructions. At the X-ray encoder, the X-ray can
be JPEG compressed using a normalization array that yields about a 25:1 compression.
While it is being transmitted over the T1 line to the remote viewing station, the encoder
can decode the compressed JPEG data and identify the “differences” between the resulting
X-ray approximation and the original X-ray image. Since we wish to transmit these
“differences” over a span of 1 minute with refinements every 5-6 seconds, there can be
no more than

$\frac{60}{6}$ to $\frac{60}{5}$ = 10 to 12 refinements.

If we assume that 12 refinements are made and that each refinement corresponds to
the “differences” between one of the 12 bits in the original X-ray and the JPEG reconstructed
approximation, then the compression that must be obtained per bit (to allow a 6
second average transfer time for each bit) is

$C_R = \frac{4096 \times 4096 \times 1}{9.264 \times 10^6} = 1.81$

where, as before, the bottom of the fraction is the number of bits that can be transmitted 
over a T1 line in 6 seconds. Thus, the “difference” data for each bit must be compressed 
by a factor just less than 2. One simple way to generate the “difference information” is to 
XOR the actual X-ray with the reconstructed JPEG approximation. The resulting binary 
image will contain a 1 in every bit position at which the approximation differs from the 
original. If the XOR result is transmitted one bit at a time beginning with the MSB 
and ending with the LSB, and each bit is compressed by an average factor of 1.81:1, 
we will achieve the performance that is required in the problem statement. To achieve 
an average error-free bit-plane compression of 1.81:1 (see Section 8.4), the XOR data
can be Gray coded, run-length coded, and finally variable-length coded. A conceptual
block diagram for both the encoder and decoder is given below. Note that the decoder
computes the bit refinements by XORing the decoded XOR data with the reconstructed 
JPEG approximation. 
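
The numbers used in this solution follow from a few lines of arithmetic, sketched below. The constant names are illustrative; the image size, bit depth, T1 rate, and timing come from the solution itself.

```python
T1_RATE = 1.544e6        # bits per second
TRANSFER_TIME = 6        # seconds available for each transfer
IMAGE_SIDE = 4096        # pixels per side
BITS_PER_PIXEL = 12

bits_per_transfer = T1_RATE * TRANSFER_TIME

# Compression needed for the initial 12-bit approximation.
initial_ratio = IMAGE_SIDE * IMAGE_SIDE * BITS_PER_PIXEL / bits_per_transfer

# Compression needed for each one-bit refinement plane.
bit_plane_ratio = IMAGE_SIDE * IMAGE_SIDE * 1 / bits_per_transfer

# Refinements possible in one minute at 6 and at 5 seconds each.
refinements = (60 // 6, 60 // 5)
```

The initial ratio is below the roughly 25:1 that the solution attributes to JPEG, which is why the first approximation fits into a single 6 second transfer.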


Table P8.23

DC Coefficient Difference   Two's Complement Value   Code
-7                          1...1001                 00000
-6                          1...1010                 00001
-5                          1...1011                 00010
-4                          1...1100                 00011
4                           0...0100                 00100
5                           0...0101                 00101
6                           0...0110                 00110
7                           0...0111                 00111
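
The codes in Table P8.23 can be generated programmatically. The sketch below assumes the usual JPEG rule for DC differences: a base code for the difference category is followed by the K least significant bits of the difference if it is positive, or of the difference minus 1 if it is negative. The base code "00" and K = 3 are inferred from the table entries rather than quoted from the text, and the function name is illustrative.

```python
def dc_difference_code(difference, base_code="00", category=3):
    """Code for a DC difference in the given category (assumed JPEG rule)."""
    magnitude = abs(difference)
    if not (1 << (category - 1)) <= magnitude < (1 << category):
        raise ValueError("difference is outside the category range")
    # Negative differences use the LSBs of (difference - 1).
    value = difference if difference > 0 else difference - 1
    # Python's & on negative integers behaves like two's complement.
    lsbs = value & ((1 << category) - 1)
    return base_code + format(lsbs, "0{}b".format(category))


table = {d: dc_difference_code(d) for d in (-7, -6, -5, -4, 4, 5, 6, 7)}
```

Subtracting 1 from the negative differences is what makes the eight code words run consecutively from 00000 to 00111.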
Figure P8.24


Problem 8.25 

To demonstrate the equivalence of the lifting based approach and the traditional FWT
filter bank method, we simply derive general expressions for one of the odd and even
outputs of the lifting algorithm of Eq. (8.6-2). For example, the Y(0) output of step 4
of the algorithm can be written as

$Y_4(0) = Y_2(0) + \delta [Y_3(-1) + Y_3(1)]$

$= X(0) + \beta [Y_1(-1) + Y_1(1)] + \delta [Y_3(-1) + Y_3(1)]$

where the subscripts on the Y's have been added to identify the step of the lifting algorithm
from which the value is generated. Continuing this substitution pattern from
earlier steps of the algorithm until $Y_4(0)$ is a function of X's only, we get

$Y(0) = [1 + 2\alpha\beta + 2\alpha\delta + 6\alpha\beta\gamma\delta + 2\gamma\delta]\, X(0)$

$+ [\beta + 3\beta\gamma\delta + \delta]\, X(1)$

$+ [\alpha\beta + 4\alpha\beta\gamma\delta + \alpha\delta + \gamma\delta]\, X(2)$

$+ [\beta\gamma\delta]\, X(3)$

$+ [\alpha\beta\gamma\delta]\, X(4)$

$+ [\beta + 3\beta\gamma\delta + \delta]\, X(-1)$

$+ [\alpha\beta + 4\alpha\beta\gamma\delta + \alpha\delta + \gamma\delta]\, X(-2)$

$+ [\beta\gamma\delta]\, X(-3)$

$+ [\alpha\beta\gamma\delta]\, X(-4).$

Thus, we can form the lowpass analysis filter coefficients shown in Table P8.25-1.


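Table P8.25-1

Coefficient Index   Expression                                                                         Value
±4                  $\alpha\beta\gamma\delta / K$                                                      0.026748757
±3                  $\beta\gamma\delta / K$                                                            -0.016864118
±2                  $(\alpha\beta + 4\alpha\beta\gamma\delta + \alpha\delta + \gamma\delta) / K$       -0.07822326
±1                  $(\beta + 3\beta\gamma\delta + \delta) / K$                                        0.26686411
0                   $(1 + 2\alpha\beta + 2\alpha\delta + 6\alpha\beta\gamma\delta + 2\gamma\delta) / K$   0.60294901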


Here, the coefficient expressions are taken directly from our expansion of Y(0) and the
division by K is in accordance with step 6 of Eq. (8.6-2). The coefficient values in
column 3 are determined by substituting the values of $\alpha$, $\beta$, $\gamma$, $\delta$, and K from the text
into the expressions of column 2. A similar derivation beginning with

$Y_3(1) = Y_1(1) + \gamma [Y_2(0) + Y_2(2)]$

yields

$Y(1) = [\alpha + 3\alpha\beta\gamma + \gamma]\, X(0)$

$+ [1 + 2\beta\gamma]\, X(1)$

$+ [\alpha + 3\alpha\beta\gamma + \gamma]\, X(2)$

$+ [\beta\gamma]\, X(3)$

$+ [\alpha\beta\gamma]\, X(4)$

$+ [\beta\gamma]\, X(-1)$

$+ [\alpha\beta\gamma]\, X(-2)$

from which we can obtain the highpass analysis filter coefficients shown in Table P8.25-2.
Table P8.25-2

Coefficient Index   Expression                                     Value
-2                  $-K(\alpha\beta\gamma)$                        -0.091271762
-1                  $-K(\beta\gamma)$                              0.057543525
0                   $-K(\alpha + 3\alpha\beta\gamma + \gamma)$     0.591271766
1                   $-K(1 + 2\beta\gamma)$                         -1.115087053
2                   $-K(\alpha + 3\alpha\beta\gamma + \gamma)$     0.591271766
3                   $-K(\beta\gamma)$                              0.057543525
4                   $-K(\alpha\beta\gamma)$                        -0.091271762
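
The two coefficient tables can also be obtained numerically by running the lifting steps on unit impulses, which is a useful cross-check of the algebra. The sketch below assumes the commonly quoted 9/7 lifting constants (the solution refers to "the text" for them without listing the values) and the six-step structure described above: four predict/update passes followed by scaling the odd outputs by -K and the even outputs by 1/K. The names `lift` and `coefficient` are illustrative.

```python
# Commonly quoted lifting constants for the 9/7 transform (assumed values).
ALPHA = -1.586134342
BETA = -0.052980118
GAMMA = 0.882911075
DELTA = 0.443506852
K = 1.230174105


def lift(x):
    """Apply the six lifting steps to a list (interior samples only)."""
    y = list(x)
    n = len(y)
    # Steps 1-4: odd samples use ALPHA and GAMMA, even samples BETA and DELTA.
    for coeff, start in ((ALPHA, 1), (BETA, 2), (GAMMA, 1), (DELTA, 2)):
        for i in range(start, n - 1, 2):
            y[i] += coeff * (y[i - 1] + y[i + 1])
    # Steps 5-6: scale odd outputs by -K and even outputs by 1/K.
    return [(-K * v) if i % 2 else (v / K) for i, v in enumerate(y)]


def coefficient(output_index, input_offset, size=33, center=16):
    """Weight of X(input_offset) in Y(output_index), found with an impulse."""
    x = [0.0] * size
    x[center + input_offset] = 1.0
    return lift(x)[center + output_index]


lowpass = {k: coefficient(0, k) for k in range(-4, 5)}
highpass = {k: coefficient(1, k) for k in range(-2, 5)}
```

The impulse is placed far from the ends of the list, so boundary handling, which this sketch does not implement, plays no role in the extracted coefficients.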


Problem 8.26 


From Eq. (8.6-5) and the problem statement, we get that

$\mu_{2LL} = \mu_0 = 8$

$\varepsilon_{2LL} = \varepsilon_0 + 2 - 2 = \varepsilon_0 = 8.$

Substituting these values into Eq. (8.6-4), we find that for the 2LL subband

$\Delta_{2LL} = 2^{(8+0)-8} \left[ 1 + \frac{8}{2^{11}} \right] = 1.00390625.$

Here, we have assumed an 8-bit image so that $R_b = 8$. Likewise, using Eqs. (8.6-5),
(8.6-4), and Fig. 8.46 (to find the analysis gain bits for each subband), we get

$\Delta_{2HH} = 2^{(8+2)-8} \left[ 1 + \frac{8}{2^{11}} \right] = 4.015625$

$\Delta_{2HL} = \Delta_{2LH} = 2^{(8+1)-8} \left[ 1 + \frac{8}{2^{11}} \right] = 2.0078125$

$\Delta_{1HH} = 2^{(8+2)-8} \left[ 1 + \frac{8}{2^{11}} \right] = 4.015625$

$\Delta_{1HL} = \Delta_{1LH} = 2^{(8+1)-8} \left[ 1 + \frac{8}{2^{11}} \right] = 2.0078125.$
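
The step sizes above can be reproduced with a one-line function. The sketch evaluates $2^{R_b - \varepsilon_b}(1 + \mu_b / 2^{11})$ exactly as the expressions in this solution are written, that is, with an exponent and mantissa of 8 for every subband and with 0, 1, or 2 analysis gain bits added to the 8-bit image depth. The function and parameter names are illustrative.

```python
def step_size(gain_bits, mu=8, epsilon=8, image_bits=8):
    """Quantization step size for a subband with the given analysis gain bits."""
    nominal_range = image_bits + gain_bits
    return 2.0 ** (nominal_range - epsilon) * (1 + mu / 2 ** 11)


steps = {
    "2LL": step_size(0),
    "2HL, 2LH, 1HL, 1LH": step_size(1),
    "2HH, 1HH": step_size(2),
}
```

Each additional analysis gain bit doubles the step size, which is the pattern visible in the five results above.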


Problem 8.27 


The appropriate MPEG decoder is shown in Fig. P8.27. 


Figure P8.27 (block labels: Encoded Block, Encoded Motion Vector, Image Block)


Problem 9.1


(a) Converting a rectangular to a hexagonal grid basically requires that even and odd 
lines be displaced horizontally with respect to each other by one-half the horizontal 
distance between adjacent pixels (see the figure in the problem statement). Since in 
a rectangular grid there are no pixel values defined at the new locations, a rule must 
be specified for their creation. A simple approach is to double the image resolution in 
both dimensions by interpolation (see Section 2.4.5). Then, the appropriate 6-connected 
points are picked out of the expanded array. The resolution of the new image will be the 
same as the original (but the former will be slightly blurred due to interpolation). Figure 
P9.1(a) illustrates this approach. The black points are the original pixels and the white 
points are the new points created by interpolation. The squares are the image points 
picked for the hexagonal grid arrangement. 

(b) A 6-neighbor arrangement is invariant to rotations in 60° increments.


(c) Yes. Ambiguities arise when there is more than one path that can be followed from 
one 6-connected pixel to another. Figure P9.1(c) shows an example, in which the 6- 
connected points of interest are in black. 


Figure P9.1 (panels (a) and (c))




Problem 9.2 


(a) The answer is shown shaded in Fig. P9.2. 

(b) With reference to the sets shown in the problem statement, the answers are, from left 
to right, 

(AnBnC)^ (Bnc); 
(4n5nC)U(dnC)U(dn B)\ and 
{Bn(du C) c } u{(dnC) - [(A nC)n(Bn C)]} . 



Figure P9.2 


Problem 9.3 


With reference to the discussion in Section 2.5.2, m-connectivity is used to avoid multiple
paths that are inherent in 8-connectivity. In one-pixel-thick, fully connected boundaries,
these multiple paths manifest themselves in the four basic patterns shown in Fig.
P9.3.

The solution to the problem is to use the hit-or-miss transform to detect the patterns 
and then to change the center pixel to 0, thus eliminating the multiple paths. A basic 
sequence of morphological steps to accomplish this is as follows: 

$X_1 = A \circledast B^1$

$Y_1 = A \cap X_1^c$

$X_2 = Y_1 \circledast B^2$

$Y_2 = Y_1 \cap X_2^c$

$X_3 = Y_2 \circledast B^3$

$Y_3 = Y_2 \cap X_3^c$

$X_4 = Y_3 \circledast B^4$

$Y_4 = Y_3 \cap X_4^c$

where A is the input image containing the boundary. 

(b) Only one pass is required. Application of the hit-or-miss transform using a given $B^i$
finds all instances of occurrence of the pattern described by that structuring element.

(c) The order does matter. For example, consider the sequence of points shown in Fig.
P9.3(c), and assume that we are traveling from left to right. If $B^1$ is applied first,
point a will be deleted and point b will remain after application of all other structuring
elements. If, on the other hand, $B^3$ is applied first, point b will be deleted and point a
will remain. Thus, we would end up with different (but, of course, acceptable) m-paths.




(c) 

Figure P9.3 


Problem 9.4 


See Fig. P9.4. Keep in mind that erosion is the set described by the origin of the 
structuring element, such that the structuring element is contained within the set being 
eroded. 
Figure P9.4

Problem 9.5 

(a) Erosion is set intersection. The intersection of two convex sets is convex also. See 
Fig. P9.5 for solutions to parts (b) through (d). Keep in mind that the digital sets in 
question are the larger black dots. The lines are shown for convenience in visualizing 
what the continuous sets would be. In (b) the result of dilation is not convex because the 
center point is not in the set. In (c) we see that the lower right point is not connected to 
the others. In (d), it is clear that the two inner points are not in the set. 



(b) (c) (d) 

Figure P9.5 


Problem 9.6 

Refer to Fig. P9.6. The center of each structuring element is shown as a black dot.
Solution (a) was obtained by eroding the original set (shown dashed) with the structuring
element shown (note that the origin is at the bottom, right). Solution (b) was obtained
by eroding the original set with the tall rectangular structuring element shown. Solution
(c) was obtained by first eroding the image shown down to two vertical lines using the
rectangular structuring element; this result was then dilated with the circular structuring
element. Solution (d) was obtained by first dilating the original set with the large disk
shown. The dilated image was then eroded with a disk of half the diameter of the disk
used for dilation.



Figure P9.6 


Problem 9.7 


The solutions to (a) through (d) are shown from top to bottom in Fig. P9.7. 


Problem 9.8 


(a) The dilated image will grow without bound. (b) A one-element set (i.e., a one-pixel
image).


Problem 9.9 


(a) The image will erode to one element. (b) The smallest set that contains the structuring
element.
Figure P9.7 (panel labels: $B^1$, $B^2$, $B^3$, $B^4$)


Problem 9.10 

The approach is to prove that

$\{ x \in Z^2 \mid (\hat{B})_x \cap A \neq \emptyset \} = \{ x \in Z^2 \mid x = a + b \text{ for } a \in A \text{ and } b \in B \}.$

The elements of $(\hat{B})_x$ are of the form $x - b$ for $b \in B$. The condition $(\hat{B})_x \cap A \neq \emptyset$
implies that for some $b \in B$, $x - b \in A$, or $x - b = a$ for some $a \in A$ (note in the
preceding equation that $x = a + b$). Conversely, if $x = a + b$ for some $a \in A$ and $b \in B$,
then $x - b = a$ or $x - b \in A$, which implies that $(\hat{B})_x \cap A \neq \emptyset$.

Problem 9.11 

(a) Suppose that $x \in A \oplus B$. Then, for some $a \in A$ and $b \in B$, $x = a + b$. Thus,
$x \in (A)_b$ and, therefore, $x \in \bigcup_{b \in B} (A)_b$. On the other hand, suppose that $x \in \bigcup_{b \in B} (A)_b$.
Then, for some $b \in B$, $x \in (A)_b$. However, $x \in (A)_b$ implies that there exists an $a \in A$
such that $x = a + b$. But, from the definition of dilation given in the problem statement,
$a \in A$, $b \in B$, and $x = a + b$ imply that $x \in A \oplus B$.

(b) Suppose that $x \in \bigcup_{b \in B} (A)_b$. Then, for some $b \in B$, $x \in (A)_b$. However, $x \in (A)_b$
implies that there exists an $a \in A$ such that $x = a + b$. But, if $x = a + b$ for some $a \in A$
and $b \in B$, then $x - b = a$ or $x - b \in A$, which implies that $x \in \{ x \mid (\hat{B})_x \cap A \neq \emptyset \}$. Now,
suppose that $x \in \{ x \mid (\hat{B})_x \cap A \neq \emptyset \}$. The condition $(\hat{B})_x \cap A \neq \emptyset$ implies that for some
$b \in B$, $x - b \in A$ or $x - b = a$ (i.e., $x = a + b$) for some $a \in A$. But, if $x = a + b$ for
some $a \in A$ and $b \in B$, then $x \in (A)_b$ and, therefore, $x \in \bigcup_{b \in B} (A)_b$.
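
The equivalence of the three descriptions of dilation used in Problems 9.10 and 9.11 can be tested on small point sets. In the sketch below, sets of integer coordinate pairs stand in for subsets of $Z^2$; the function names, the example sets, and the finite search window used for the third definition are assumptions made for illustration.

```python
def translate(points, shift):
    """Translate a set of (x, y) points by a shift vector."""
    return {(x + shift[0], y + shift[1]) for x, y in points}


def reflect(points):
    """Reflect a set of points about the origin."""
    return {(-x, -y) for x, y in points}


def dilation_sum(a, b):
    """All x = a + b with a in A and b in B."""
    return {(ax + bx, ay + by) for ax, ay in a for bx, by in b}


def dilation_union(a, b):
    """Union of the translates of A by every b in B."""
    result = set()
    for shift in b:
        result |= translate(a, shift)
    return result


def dilation_hit(a, b, window):
    """All x in the window whose translated, reflected B overlaps A."""
    b_hat = reflect(b)
    return {x for x in window if translate(b_hat, x) & a}


A = {(0, 0), (1, 0), (1, 1)}
B = {(0, 0), (0, 1), (2, 0)}          # deliberately not symmetric
WINDOW = {(x, y) for x in range(-5, 9) for y in range(-5, 9)}

by_sum = dilation_sum(A, B)
by_union = dilation_union(A, B)
by_hit = dilation_hit(A, B, WINDOW)
```

A non-symmetric B is used on purpose, since with a symmetric structuring element the reflection in the third definition would make no visible difference.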


Problem 9.12 


The proof, which consists of proving that

$\{ x \in Z^2 \mid x + b \in A, \text{ for every } b \in B \} = \{ x \in Z^2 \mid (B)_x \subseteq A \},$

follows directly from the definition of translation because the set $(B)_x$ has elements of
the form $x + b$ for $b \in B$. That is, $x + b \in A$ for every $b \in B$ implies that $(B)_x \subseteq A$.
Conversely, $(B)_x \subseteq A$ implies that all elements of $(B)_x$ are contained in A, or $x + b \in A$
for every $b \in B$.


Problem 9.13 


(a) Let $x \in A \ominus B$. Then, from the definition of erosion given in the problem statement,
for every $b \in B$, $x + b \in A$. But, $x + b \in A$ implies that $x \in (A)_{-b}$. Thus, for every
$b \in B$, $x \in (A)_{-b}$, which implies that $x \in \bigcap_{b \in B} (A)_{-b}$. Suppose now that $x \in \bigcap_{b \in B} (A)_{-b}$.
Then, for every $b \in B$, $x \in (A)_{-b}$. Thus, for every $b \in B$, $x + b \in A$ which, from the
definition of erosion, means that $x \in A \ominus B$.

(b) Suppose that $x \in A \ominus B = \bigcap_{b \in B} (A)_{-b}$. Then, for every $b \in B$, $x \in (A)_{-b}$, or
$x + b \in A$. But, as shown in Problem 9.12, $x + b \in A$ for every $b \in B$ implies that
$(B)_x \subseteq A$, so that $x \in A \ominus B = \{ x \in Z^2 \mid (B)_x \subseteq A \}$. Similarly, $(B)_x \subseteq A$ implies
that all elements of $(B)_x$ are contained in A, or $x + b \in A$ for every $b \in B$ or, as in
(a), $x + b \in A$ implies that $x \in (A)_{-b}$. Thus, if for every $b \in B$, $x \in (A)_{-b}$, then
$x \in \bigcap_{b \in B} (A)_{-b}$.
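
The two descriptions of erosion in Problems 9.12 and 9.13 can be compared on a small example in the same way. The sketch below represents sets as Python sets of integer coordinate pairs; the function names, the 4 x 4 test square, the three-point structuring element, and the finite search window are illustrative assumptions.

```python
def erosion_definition(a, b, window):
    """All x in the window such that x + b is in A for every b in B."""
    return {x for x in window
            if all((x[0] + bx, x[1] + by) in a for bx, by in b)}


def erosion_intersection(a, b):
    """Intersection of the translates of A by -b for every b in B."""
    result = None
    for bx, by in b:
        shifted = {(x - bx, y - by) for x, y in a}
        result = shifted if result is None else result & shifted
    return result or set()


A = {(x, y) for x in range(4) for y in range(4)}    # 4 x 4 square
B = {(0, 0), (1, 0), (0, 1)}
WINDOW = {(x, y) for x in range(-3, 8) for y in range(-3, 8)}

by_definition = erosion_definition(A, B, WINDOW)
by_intersection = erosion_intersection(A, B)
```

For this choice of A and B the erosion is the 3 x 3 square in the lower-left corner of A, the only positions where all three points of the translated B fit inside the original square.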
Problem 9.14

Starting with the definition of closing,

$(A \bullet B)^c = [(A \oplus B) \ominus B]^c$

$= (A \oplus B)^c \oplus \hat{B}$

$= (A^c \ominus \hat{B}) \oplus \hat{B}$

$= A^c \circ \hat{B}.$

Problem 9.15 

(a) Erosion of a set A by B is defined as the set of all values of translates, z, of B such
that $(B)_z$ is contained in A. If the origin of B is contained in B, then the set of points
describing the erosion is simply all the possible locations of the origin of B such that
$(B)_z$ is contained in A. Then it follows from this interpretation (and the definition of
erosion) that erosion of A by B is a subset of A. Similarly, dilation of a set C by B is
the set of all locations of the origin of B such that the intersection of C and $(\hat{B})_z$ is not
empty. If the origin of B is contained in B, this implies that C is a subset of the dilation
of C by B. Now, from Eq. (9.3-1), we know that $A \circ B = (A \ominus B) \oplus B$. Let C denote
the erosion of A by B. It was already established that C is a subset of A. From the
preceding discussion, we know also that C is a subset of the dilation of C by B. But C
is a subset of A, so the opening of A by B (the erosion of A by B followed by a dilation
of the result) is a subset of A.

(b) From Eq. (9.3-3),

$C \circ B = \bigcup \{ (B)_z \mid (B)_z \subseteq C \}$

and

$D \circ B = \bigcup \{ (B)_z \mid (B)_z \subseteq D \}.$

Therefore, if $C \subseteq D$, it follows that $C \circ B \subseteq D \circ B$.

(c) From (a), $(A \circ B) \circ B \subseteq (A \circ B)$. From the definition of opening,

$(A \circ B) \circ B = \{ (A \circ B) \ominus B \} \oplus B$

$= \{ [(A \ominus B) \oplus B] \ominus B \} \oplus B$

$= \{ (A \ominus B) \bullet B \} \oplus B$

$\supseteq (A \ominus B) \oplus B$

$\supseteq A \circ B.$

But, the only way that $(A \circ B) \circ B \subseteq (A \circ B)$ and $(A \circ B) \circ B \supseteq (A \circ B)$ can hold
is if $(A \circ B) \circ B = (A \circ B)$. The next to last step in the preceding sequence follows
from the fact that the closing of a set by another contains the original set [this is from
Problem 9.16(a)].
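
The three properties of opening proved above can be observed on a small example. In the sketch below, sets are represented as Python sets of integer coordinate pairs; the function names and the particular test sets (a 5 x 5 square, a one-pixel-wide protrusion, an isolated pixel, and a 2 x 2 structuring element containing the origin) are illustrative assumptions.

```python
def dilate(a, b):
    """A dilated by B: all sums a + b."""
    return {(ax + bx, ay + by) for ax, ay in a for bx, by in b}


def erode(a, b):
    """A eroded by B: all x with x + b in A for every b in B."""
    candidates = {(ax - bx, ay - by) for ax, ay in a for bx, by in b}
    return {x for x in candidates
            if all((x[0] + bx, x[1] + by) in a for bx, by in b)}


def opening(a, b):
    """Erosion followed by dilation with the same structuring element."""
    return dilate(erode(a, b), b)


B = {(0, 0), (1, 0), (0, 1), (1, 1)}                  # contains the origin
SQUARE = {(x, y) for x in range(5) for y in range(5)}
PROTRUSION = {(5, 2), (6, 2)}                          # one pixel wide
NOISE = {(10, 10)}                                     # isolated pixel
A = SQUARE | PROTRUSION | NOISE

opened = opening(A, B)
opened_twice = opening(opened, B)
```

Opening removes the isolated pixel and the thin protrusion, both of which are too small to contain a translate of B, and leaves the square unchanged.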


Problem 9.16 


(a) From Problem 9.14, $(A \bullet B)^c = A^c \circ \hat{B}$, and, from Problem 9.15(a), it follows that

$(A \bullet B)^c = A^c \circ \hat{B} \subseteq A^c.$

Taking the complement of both sides of this equation reverses the inclusion sign and we
have that $A \subseteq (A \bullet B)$, as desired.

(b) From Problem 9.15(b), if $D^c \subseteq C^c$, then $D^c \circ \hat{B} \subseteq C^c \circ \hat{B}$, where we used $D^c$, $C^c$,
and $\hat{B}$ instead of C, D, and B. From Problem 9.14, $(C \bullet B)^c = C^c \circ \hat{B}$ and $(D \bullet B)^c =
D^c \circ \hat{B}$. Therefore, if $D^c \subseteq C^c$ then $(D \bullet B)^c \subseteq (C \bullet B)^c$. Taking complements
reverses the inclusion, so we have that if $C \subseteq D$, then $(C \bullet B) \subseteq (D \bullet B)$, as desired.

(c) Starting with the result of Problem 9.14,

$(A \bullet B) \bullet B = \{ (A \bullet B)^c \circ \hat{B} \}^c$

$= \{ (A^c \circ \hat{B}) \circ \hat{B} \}^c$

$= \{ A^c \circ \hat{B} \}^c$

$= \{ (A \bullet B)^c \}^c$

$= (A \bullet B)$

where the third step follows from Problem 9.15(c) and the fourth step follows from
Problem 9.14.
Problem 9.17


The solution is shown in Fig. P9.17. Although the images shown could be sketched
by hand, they were done in MATLAB. The size of the original is 647 x 624 pixels.
A disk structuring element of radius 11 was used. This structuring element was just
large enough to encompass all noise elements, as given in the problem statement. The
images shown in Fig. P9.17 are: (a) erosion of the original, (b) dilation of the result, (c)
another dilation, and finally (d) an erosion. The main points we are looking for from
the student's answer are: The first erosion (leftmost image) should take out all noise
elements that do not touch the rectangle, should increase the size of the noise elements
completely contained within the rectangle, and should decrease the size of the rectangle.
If worked by hand, the student may or may not realize that some "imperfections" are left
along the boundary of the object. We do not consider this an important issue because
it is scale-dependent, and nothing is said in the problem statement about this. The first
dilation (next image) should shrink the noise components that were increased in erosion,
should increase the size of the rectangle, and should round the corners. The next dilation
should eliminate the internal noise components completely and further increase the size
of the rectangle. The final erosion (last image on the right) should then decrease the size
of the rectangle. The rounded corners in the final answer are an important point that
should be recognized by the student.
Problem 9.18

It was possible to reconstruct the three large squares to their original size because they 
were not completely eroded and the geometry of the objects and structuring element 
was the same (i.e., they were squares). This also would have been true if the objects 
and structuring elements were rectangular. However, a complete reconstruction, for 
instance, by dilating a rectangle that was partially eroded by a circle, would not be 
possible. 

Problem 9.19 

(a) Select a one-pixel border around the image of the T, assuming that the resulting
subimage is odd. Let the origin be located at the horizontal/vertical midpoint of this
subimage (if the dimensions were even, we could just as easily select any other point).
The result of applying the hit-or-miss transform would be a single point where the
two T's were in perfect registration. The location of the point would be the same as the
origin of the structuring element.

(b) The hit-or-miss transform and (normalized) correlation are similar in the sense that 
they produce their maximum value at the location of a perfect match, and also in the 
mechanics of sliding the template (structuring element) past all locations in the image. 
Major differences are the lack of a complex conjugate in the hit-or-miss transform, and 
the fact that this transform produced a single nonzero binary value in this case, as op- 
posed to the multiple nonzero values produced by correlation of the two images. 

Problem 9.20 


The key difference between the Lake and the other two features is that the former forms
a closed contour. Assuming that the shapes are processed one at a time, a basic two-step
approach for differentiating between the three shapes is as follows:

Step 1. Apply an end-point detector to the object until convergence is achieved. If the
result is not the empty set, the object is a Lake. Otherwise it is a Bay or a Line.

Step 2. There are numerous ways to differentiate between a Bay and a Line. One of the
simplest is to determine a line joining the two end points of the object. If the AND of
the object and this line contains only two points, the figure is a Bay. Otherwise it is
a line segment. There are pathological cases in which this test will fail, and additional
"intelligence" needs to be built into the process, but these pathological cases become
less probable with increasing resolution of the thinned figures.


Problem 9.21 

(a) The entire image would be filled with 1's. (b) The background would be filled with
1's. (c) See Fig. P9.21.


Figure P9.21 


Problem 9.22 


(a) With reference to the example shown in Fig. P9.22(a), the boundary that results
from using the structuring element in Fig. 9.15(c) generally forms an 8-connected path
(leftmost figure), whereas the boundary resulting from the structuring element in Fig.
9.13(b) forms a 4-connected path (rightmost figure).


(b) Using a 3 x 3 structuring element of all 1's would introduce corner pixels into segments
characterized by diagonally-connected pixels. For example, square (2,2) in Fig.
9.15(e) would be a 1 instead of a 0. That value of 1 would carry all the way to the final
result in Fig. 9.15(i). There would be other 1's introduced that would turn Fig. 9.15(i)
into a much more distorted object.
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Problem 9.23 


If spheres are allowed to touch, we can make the simplifying assumption that no spheres 
touch it in such a way that they create “pockets” of black points surrounded by all white 
or surrounded by all white and part of the boundary of the image. This situation requires 
additional preprocessing, as discussed below. With these simplification in mind, the 
problem reduces first to determining which points are background (black) points. To 
do this, we pick a black point on the boundary of the image and find all black points 
connected to it using a connected component algorithm (Section 9.5.3). These connected 
components are labels with a value different from 1 or 0. The remaining black points 
are interior to spheres. We can fill all spheres with white by applying the region filling 
algorithm until all interior black points have been turned into white points. The alert 
student will realize that if the interior points are already known, they can all be turned 
simply into white points thus filling the spheres without having to do region filling as a 
separate procedure. 

If the spheres are allowed to touch in arbitrary ways, a way must be found to separate
them because they could create "pockets" of black points surrounded by all white or
surrounded by all white and part of the boundary of the image. The simplest approach
is to separate the spheres by preprocessing. One way to do this is to erode the white
components of the image by one pass of a 3 x 3 mask, effectively creating a black
border around the spheres, thus "separating" them. This approach works in this case
because the objects are spherical, thus having small areas of contact. To handle the
case of spheres touching the border of the image, we simply set all border points to
black. We then proceed to find all background points. To do this, we pick a point on the
boundary of the image (which we know is black due to preprocessing) and find all black
points connected to it using a connected component algorithm (Section 9.5.3). These
connected components are labeled with a value different from 1 or 0. The remaining
black points are interior to spheres. We can fill all spheres with white by applying the
region filling algorithm until all such interior black points have been turned into white
points. The alert student will realize that if the interior points are already known, they
can simply be turned into white points, thus filling the spheres without having to do
region filling as a separate procedure.
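
The shortcut mentioned at the end of the preceding paragraph can be sketched in a few lines. This is only an illustration under stated assumptions: the image is a list of lists of 0's and 1's, the background is taken to be 4-connected, and a breadth-first search stands in for the connected component algorithm of Section 9.5.3. The search is seeded from every black border pixel rather than from a single one, and the function name `fill_spheres` is hypothetical.

```python
from collections import deque

def fill_spheres(image):
    """Return a copy of a binary image with every interior black pixel set to 1.

    Black (0) pixels that are 4-connected to the image border are background;
    all other black pixels are assumed to be interior to a sphere.
    """
    rows, cols = len(image), len(image[0])
    background = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every black pixel on the image border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and image[r][c] == 0:
                background[r][c] = True
                queue.append((r, c))
    # Breadth-first search over 4-connected black pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                if image[rr][cc] == 0 and not background[rr][cc]:
                    background[rr][cc] = True
                    queue.append((rr, cc))
    # Everything that is not background becomes white.
    return [[0 if background[r][c] else 1 for c in range(cols)]
            for r in range(rows)]
```

Applied to a ring of 1's around a single 0, the sketch turns the enclosed 0 into a 1 and leaves the outer background untouched.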

Note that the erosion of white areas makes the black areas interior to the spheres grow,
so the possibility exists that such an area near the border of a sphere could grow into the
background. This issue introduces further complications that the student may not have
the tools to solve yet. We recommend making the assumption that the interior black
areas are small and near the center. Recognition of the potential problem by the student
should be sufficient.


Problem 9.24 

Denote the original image by $A$. Create an image of the same size as the original,
but consisting of all 0's, and call it $B$. Choose an arbitrary point labeled 1 in $A$, call it
$p_1$, and apply the algorithm. When the algorithm converges, a connected component
has been detected. Label and copy into $B$ the set of all points in $A$ belonging to the
connected component just found, set those points to 0 in $A$, and call the modified image
$A_1$. Choose an arbitrary point labeled 1 in $A_1$, call it $p_2$, and repeat the procedure just
given. If there are $K$ connected components in the original image, this procedure will
result in an image consisting of all 0's after $K$ applications of the procedure just given.
Image $B$ will contain $K$ labeled connected components.
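
The bookkeeping described above (pick a seed, extract its component, copy it into B with a new label, erase it from A, repeat) can be sketched as follows. Note the substitution: a breadth-first search over 8-neighbors is used here in place of the morphological extraction algorithm, so this illustrates the labeling loop only. The function name `label_components` and the choice of 8-connectivity are assumptions.

```python
from collections import deque

def label_components(a):
    """Label the 8-connected components of 1's in binary image `a`.

    Returns (b, k): b is the labeled image (0 = background, 1..k = labels)
    and k is the number of connected components found.
    """
    rows, cols = len(a), len(a[0])
    a = [row[:] for row in a]            # working copy: A, A_1, A_2, ...
    b = [[0] * cols for _ in range(rows)]
    k = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if a[r0][c0] != 1:
                continue
            k += 1                       # new seed point p_k
            a[r0][c0] = 0
            b[r0][c0] = k
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < rows and 0 <= cc < cols
                                and a[rr][cc] == 1):
                            a[rr][cc] = 0    # erase the point from A
                            b[rr][cc] = k    # copy it into B with label k
                            queue.append((rr, cc))
    return b, k
```

When the outer loops finish, the working copy of A contains only 0's, which mirrors the termination condition stated in the solution.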

Problem 9.25 

(a) Equation (9.6-1) requires that the $(x, y)$ used in the computation of dilation must
satisfy the condition $(x, y) \in D_b$. In terms of the intervals given in the problem
statement, this means that $x$ and $y$ must be in the closed intervals $x \in [B_{x1}, B_{x2}]$ and
$y \in [B_{y1}, B_{y2}]$. It is required also that $(s - x), (t - y) \in D_f$, which means that
$(s - x) \in [F_{x1}, F_{x2}]$ and $(t - y) \in [F_{y1}, F_{y2}]$. Since the valid range of $x$ is the interval
$[B_{x1}, B_{x2}]$, the valid range of $(s - x)$ is $[s - B_{x1}, s - B_{x2}]$. But, since $x$ must also satisfy
the condition $(s - x) \in [F_{x1}, F_{x2}]$, it follows that $F_{x1} \le s - B_{x1}$ and $F_{x2} \ge s - B_{x2}$,
which finally yields $F_{x1} + B_{x1} \le s \le F_{x2} + B_{x2}$. Following the same analysis for $t$
yields $F_{y1} + B_{y1} \le t \le F_{y2} + B_{y2}$. Since dilation is a function of $(s, t)$, these two
inequalities establish the domain of $(f \oplus b)(s, t)$ in the $st$-plane.

(b) Following a similar procedure yields the following intervals for $s$ and $t$:
$F_{x1} - B_{x1} \le s \le F_{x2} - B_{x2}$ and $F_{y1} - B_{y1} \le t \le F_{y2} - B_{y2}$. Since erosion is a
function of $(s, t)$, these two inequalities establish the domain of $(f \ominus b)(s, t)$ in the
$st$-plane.

Problem 9.26 


(a) The noise spikes are of the general form shown in Fig. P9.26(a), with other
possibilities in between. The amplitude is irrelevant in this case; only the shape of the noise
spikes is of interest. To remove these spikes we perform an opening with a cylindrical
structuring element of radius greater than $R_{\max}$, as shown in Fig. P9.26(b) (see Fig.
9.30 for an explanation of the process). Note that the shape of the structuring element is
matched to the known shape of the noise spikes.


(b) The basic solution is the same as in (a), but now we have to take into account the
various possible overlapping geometries shown in Fig. P9.26(c). A structuring element
like the one used in (a) but with radius slightly larger than $4R_{\max}$ will do the job. Note in
(a) and (b) that other parts of the image would be affected by this approach. The bigger
$R_{\max}$, the bigger the structuring element that would be needed and, consequently, the
greater the effect on the image as a whole.




Figure P9.26 (basic geometries of the noise spikes and their profiles; dimensions marked 2R)


Problem 9.27 


(a) Color the image border pixels the same color as the particles (white). Call the result- 
ing set of border pixels B. Apply the connected component algorithm. All connected 
components that contain elements from B are particles that have merged with the border 
of the image.

(b) It is given that all particles are of the same size (this is done to simplify the problem;
more general analysis requires tools from Chapter 11). Determine the area (number of
pixels) of a single particle; denote the area by $R$. Eliminate from the image the particles
that were merged with the border of the image. Apply the connected component
algorithm. Count the number of pixels in each component. A component is then designated
as a single particle if the number of pixels is less than or equal to $R + \varepsilon$, where $\varepsilon$ is a
small quantity added to account for variations in size due to noise.

(c) Subtract from the image single particles and the particles that have merged with the 
border, and the remaining particles are overlapping particles. 


Problem 9.28 


As given in the problem statement, interest lies on deviations from the round in the inner 
and outer boundaries of the washers. It also is stated that we can ignore errors due to 
digitizing and positioning. This means that the imaging system has enough resolution so 
that artifacts will not be introduced as a result of digitization. The mechanical accuracy 
similarly tells us that no appreciable errors will be introduced as a result of positioning. 
This is important if we want to do matching without having to register the images. 

The first step in the solution is the specification of an illumination approach. Because 
we are interested in boundary defects, the method of choice is a backlighting system that 
will produce a binary image. We are assured from the problem statement that the illu- 
mination system has enough resolution so that we can ignore defects due to digitizing. 

The next step is to specify a comparison scheme. The simplest way to match binary 
images is to AND one image with the complement of the other. Here, we match the 
input binary image with the complement of the golden image (this is more efficient than 
computing the complement of each input image and comparing it to the golden image). 
If the images are identical (and perfectly registered) the result of the AND operation will 
be all 0's. Otherwise, there will be 1's in the areas where the two images do not match.
Note that this requires that the images be of the same size and be registered, thus the 
assumption of the mechanical accuracy given in the problem statement. 
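
The comparison step can be sketched as follows, assuming binary images stored as lists of lists of 0's and 1's that are already registered and of equal size. The function names `complement` and `mismatch` are hypothetical; the complement of the golden image is computed once and reused for every input image, as the solution suggests.

```python
def complement(image):
    """Complement of a binary image (0 <-> 1)."""
    return [[1 - p for p in row] for row in image]

def mismatch(input_image, golden_complement):
    """Pixel-by-pixel AND of the input image with the golden-image complement."""
    return [[p & q for p, q in zip(row_in, row_gc)]
            for row_in, row_gc in zip(input_image, golden_complement)]

golden = [[0, 1, 1, 0],
          [0, 1, 1, 0]]
golden_c = complement(golden)                  # computed once, reused per input

identical = mismatch(golden, golden_c)         # all 0's: perfect match
defective = mismatch([[0, 1, 1, 1],
                      [0, 1, 1, 0]], golden_c) # a 1 marks the extra pixel
```

One caveat worth raising with students: this one-sided AND flags only pixels that are 1 in the input and 0 in the golden image. Pixels that are missing from the input would need the reverse test (or an exclusive-OR) to show up.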

As noted, differences in the images will appear as regions of 1's in the AND image.
These we group into regions (connected components) by using the algorithm given in
Section 9.5.3. Once all connected components have been extracted, we can compare
them against specified criteria for acceptance or rejection of a given washer. The simplest
criterion is to set a limit on the number and size (number of pixels) of connected
components. The most stringent criterion is 0 connected components. This means a
perfect match. The next level for "relaxing" acceptance is one connected component
of size 1, and so on. More sophisticated criteria might involve measures like the
shape of connected components and their locations relative to each other.
These types of descriptors are studied in Chapter 11.

Problem 10.1

The masks would have the coefficients shown in Fig. P10.1. Each mask would yield
a value of 0 when centered on a pixel of an unbroken 3-pixel segment oriented in the
direction favored by that mask. Conversely, the response would be a +2 when a mask
is centered on a one-pixel gap in a 3-pixel segment oriented in the direction favored by
that mask.

Figure P10.1 (mask coefficients for the horizontal, vertical, +45°, and -45° directions)


Problem 10.2 


The key to solving this problem is to find all end points of line segments in the image.
End points are those points on a line which have only one 8-neighbor valued 1. Once all
end points have been found, the $D_8$ distance between all pairs of such end points gives
the lengths of the various gaps. We choose the smallest distance between end points
of every pair of segments and any such distance less than or equal to $L$ satisfies the
statement of the problem. This is a rudimentary solution, and numerous embellishments
can be added to build intelligence into the process. For example, it is possible for end
points of different, but closely adjacent, lines to be less than $L$ pixels apart, and heuristic
tests that attempt to sort out things like this are quite useful. Although the problem
statement does not call for any such tests, they are normally needed in practice and it is
worthwhile to bring this up in class if this particular problem is assigned as a homework
assignment.
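
A minimal sketch of the two ingredients used above, end-point detection and the $D_8$ (chessboard) distance, assuming a binary image stored as a list of lists. The function names are hypothetical, and grouping the end points by segment is left out.

```python
def end_points(image):
    """Return (row, col) of every 1-valued pixel with exactly one 1-valued 8-neighbor."""
    rows, cols = len(image), len(image[0])
    points = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1:
                continue
            neighbors = sum(
                image[r + dr][c + dc]
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
                and 0 <= r + dr < rows and 0 <= c + dc < cols
            )
            if neighbors == 1:
                points.append((r, c))
    return points

def d8(p, q):
    """Chessboard (D8) distance between pixels p and q."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

row = [[1, 1, 1, 0, 0, 1, 1, 1]]     # two segments separated by a gap
pts = end_points(row)
gap = d8(pts[1], pts[2])             # distance across the gap
```

In this example two pixels are missing between the segments and the $D_8$ distance between the facing end points is 3, so the distance exceeds the number of missing pixels by one; the comparison against $L$ should be set up with that convention in mind.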


Problem 10.3 

(a) The lines were thicker than the width of the line detector masks. Thus, when, for
example, a mask was centered on the line it "saw" a constant area and gave a response
of 0.

(b) Via connectivity analysis. 

Problem 10.4 


It is given that the location of the edge relative to the size of the mask is such that image
border effects can be ignored. Assume that $n$ is odd and keep in mind that an ideal step
edge transition takes place between adjacent pixels. Then, the average is 0 as long as the
center of the mask is $(n - 1)/2$ pixels or more to the left of the edge. The average is
1 when the center of the mask is further away than $(n - 1)/2$ pixels to the right of the
edge. When transitioning into the edge (say from left to right), the average picks up one
column of the mask for every pixel that it moves to the right, so the value of the average
grows as $n/n^2, 2n/n^2, \ldots, (n - 1)n/n^2, n^2/n^2$, or $1/n, 2/n, \ldots, (n - 1)/n, 1$. This
is a simple linear growth with slope equal to $1/n$. Figure P10.4 shows a plot of the
original profile and what the profile would look like after smoothing. Thus, we get a
ramp edge, as expected.

Figure P10.4 (original step profile and the ramp profile obtained after smoothing)


Problem 10.5 


The gradient and Laplacian (first and second derivatives) are shown in Fig. P10.5. 



Figure P10.5 


Problem 10.6 


(a) Inspection of the Sobel masks shows that $G_x = 0$ for edges oriented vertically and
$G_y = 0$ for edges oriented horizontally. Therefore, it follows in this case that, for
vertical edges, $\nabla f = \sqrt{G_y^2} = |G_y|$, and similarly for horizontal edges.


(b) The same argument applies to the Prewitt masks. 


Problem 10.7 


Consider first the Sobel masks of Figs. 10.8 and 10.9. The easiest way to prove that
these masks give isotropic results for edge segments oriented at multiples of 45° is to
obtain the mask responses for the four general edge segments shown in Fig. P10.7,
which are oriented at increments of 45°. The objective is to show that the responses
of the Sobel masks are indistinguishable for these four edges. That this is the case is
evident from Table P10.7, which shows the response of each Sobel mask to the four
general edge segments. We see that in each case the response of the mask that matches
the edge direction is $(4a - 4b)$, and the response of the corresponding orthogonal mask
is 0. The response of the remaining two masks is either $(3a - 3b)$ or $(3b - 3a)$. The
sign difference is not significant because the gradient is computed by either squaring or
taking the absolute value of the mask responses. The same line of reasoning applies to
the Prewitt masks.


Table P10.7

Edge          Horizontal     Vertical       +45°            -45°
direction     Sobel (G_x)    Sobel (G_y)    Sobel (G_45)    Sobel (G_-45)

Horizontal    4a - 4b        0              3a - 3b         3b - 3a
Vertical      0              4a - 4b        3a - 3b         3a - 3b
+45°          3a - 3b        3a - 3b        4a - 4b         0
-45°          3b - 3a        3a - 3b        0               4a - 4b


Figure P10.7 (four 3 x 3 edge segments with gray levels a and b, oriented horizontally,
vertically, at +45°, and at -45°)


Problem 10.8 


With reference to Fig. P10.8, consider first the 3 x 3 smoothing mask mentioned in the
problem statement, as well as the general subimage area shown in the figure. Recall that
value $e$ is replaced by the response of the 3 x 3 mask when its center is at that location.
Ignoring the 1/9 scale factor, the response of the mask when centered at that location is
$(a + b + c + d + e + f + g + h + i)$.

The idea with the one-dimensional mask is the same: We replace the value of a pixel by
the response of the mask when it is centered on that pixel. With this in mind, the mask
$[1\ 1\ 1]$ would yield the following responses when centered at the pixels with values $b$, $e$,
and $h$, respectively: $(a + b + c)$, $(d + e + f)$, and $(g + h + i)$. Next, we pass the mask

$$\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

through these results. When this mask is centered at the pixel with value $e$, its response
will be $[(a + b + c) + (d + e + f) + (g + h + i)]$, which is the same as the result produced
by the 3 x 3 smoothing mask.

Returning now to the problem at hand, when the $G_x$ Sobel mask is centered at the pixel with
value $e$, its response is $G_x = (g + 2h + i) - (a + 2b + c)$. If we pass the one-dimensional
differencing mask

$$\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}$$

through the image, its response when its center is at the pixels with values $d$, $e$, and $f$,
respectively, would be: $(g - a)$, $(h - b)$, and $(i - c)$. Next we apply the smoothing mask
$[1\ 2\ 1]$ to these results. When the mask is centered at the pixel with value $e$, its response
would be $[(g - a) + 2(h - b) + (i - c)]$, which is $[(g + 2h + i) - (a + 2b + c)]$. This is the
same as the response of the 3 x 3 Sobel mask for $G_x$. The process to show equivalence
for $G_y$ is basically the same. Note, however, that the directions of the one-dimensional
masks would be reversed in the sense that the differencing mask would be a row
mask and the smoothing mask would be a column mask.
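
The equivalence for $G_x$ can be verified numerically. The sketch below assumes the subimage layout a b c / d e f / g h i with rows indexed from top to bottom, so that $G_x$ is the weighted bottom row minus the weighted top row. Both function names are hypothetical.

```python
def sobel_gx(img, r, c):
    """3 x 3 Sobel response G_x = (g + 2h + i) - (a + 2b + c) centered at (r, c)."""
    top = img[r - 1][c - 1] + 2 * img[r - 1][c] + img[r - 1][c + 1]
    bottom = img[r + 1][c - 1] + 2 * img[r + 1][c] + img[r + 1][c + 1]
    return bottom - top

def sobel_gx_separable(img, r, c):
    """Same response from a column differencing mask followed by the row mask [1 2 1]."""
    def diff(rr, cc):
        # Response of the column mask [-1, 0, 1] centered at (rr, cc).
        return img[rr + 1][cc] - img[rr - 1][cc]
    return diff(r, c - 1) + 2 * diff(r, c) + diff(r, c + 1)

image = [[3, 1, 4, 1, 5],
         [9, 2, 6, 5, 3],
         [5, 8, 9, 7, 9],
         [3, 2, 3, 8, 4],
         [6, 2, 6, 4, 3]]
agree = all(sobel_gx(image, r, c) == sobel_gx_separable(image, r, c)
            for r in range(1, 4) for c in range(1, 4))
```

The two computations agree at every interior pixel, as the algebra above requires.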


1  1  1          a  b  c
1  1  1          d  e  f
1  1  1          g  h  i

Smoothing mask (scaled by 1/9), left, and subimage area under the mask at any one
time, right.

Figure P10.8




Problem 10.9 


The solution is shown in Fig. P10.9 (negative numbers are shown underlined). 

Figure P10.9 (edge direction)


Problem 10.10 


(a) The solution is shown in Fig. P10.10(a). The numbers in brackets are values of
$[G_x, G_y]$.

(b) The solution is shown in Fig. P10.10(b). The angle was not computed
for the trivial cases in which $G_x = G_y = 0$. The histogram follows directly from this
table.

(c) The solution is shown in Fig. P10.10(c).


Problem 10.11 


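(a) With reference to Eq. (10.1-17), we need to prove that

$$\int_{-\infty}^{\infty} \left[ \frac{r^2 - \sigma^2}{\sigma^4} \right] e^{-\frac{r^2}{2\sigma^2}} \, dr = 0.$$

Expanding this equation results in the expression

$$\int_{-\infty}^{\infty} \left[ \frac{r^2 - \sigma^2}{\sigma^4} \right] e^{-\frac{r^2}{2\sigma^2}} \, dr
= \frac{1}{\sigma^4} \int_{-\infty}^{\infty} r^2 e^{-\frac{r^2}{2\sigma^2}} \, dr
- \frac{1}{\sigma^2} \int_{-\infty}^{\infty} e^{-\frac{r^2}{2\sigma^2}} \, dr.$$

Recall from the definition of the Gaussian density that

$$\frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} e^{-\frac{r^2}{2\sigma^2}} \, dr = 1$$

and, from the definition of the variance of a Gaussian random variable that

$$\mathrm{Var}(r) = \sigma^2 = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{\infty} r^2 e^{-\frac{r^2}{2\sigma^2}} \, dr.$$

Thus, it follows from the preceding equations that

$$\int_{-\infty}^{\infty} \left[ \frac{r^2 - \sigma^2}{\sigma^4} \right] e^{-\frac{r^2}{2\sigma^2}} \, dr
= \frac{\sqrt{2\pi\sigma^2}}{\sigma^4}\, \sigma^2 - \frac{\sqrt{2\pi\sigma^2}}{\sigma^2} = 0.$$

Figure P10.10 (panels (a) through (c))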


(b) Suppose that we convolve an image $f$ with $\nabla^2 h$. Using the convolution theorem, this
is the same as multiplying the Fourier transform of $f$ by the Fourier transform of $\nabla^2 h$.
The average value of the convolution can be obtained by evaluating the Fourier transform
of this product at the origin of the frequency plane [see Eq. (4.2-22)]. But, it was shown
in (a) that the average value of $\nabla^2 h$ is zero, which means that its Fourier transform is
zero at the origin. From this it follows that the value of the product of the two Fourier
transforms is also zero, thus proving that the average value of the convolution of $f$ with
$\nabla^2 h$ is zero.

(c) Yes. Consider Eq. (10.1-14), expressed as

$$\nabla^2 f(x, y) = 4f(x, y) - [f(x + 1, y) + f(x - 1, y) + f(x, y + 1) + f(x, y - 1)].$$

As in (b), we evaluate the average value of a spatial expression by looking at the value
of its Fourier transform at the origin. Here, it follows from Eq. (4.6-2) that, if $F(u, v)$
denotes the Fourier transform of $f(x, y)$, then the transforms of all the terms inside
the brackets in the above equation are $F(u, v)$ multiplied by appropriate exponential
terms. However, the exponential terms have value 1 at the origin, so the net result is
$4F(0, 0) - 4F(0, 0) = 0$, thus proving that the Laplacian obtained by convolving an
image with the operator shown in Fig. 10.13 [which implements Eq. (10.1-14)] has an
average value of zero. The same zero result is obtained for Eq. (10.1-15).
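
Both zero-mean claims are easy to check numerically. The sketch below integrates the bracketed LoG expression from part (a) with a plain trapezoidal rule over a range wide enough that the Gaussian tail is negligible, and sums the coefficients of the 4-neighbor Laplacian mask from part (c). The value of sigma, the integration limits, and the step count are arbitrary choices made for illustration.

```python
import math

def log_integrand(r, sigma):
    """[(r^2 - sigma^2) / sigma^4] * exp(-r^2 / (2 sigma^2))."""
    return (r * r - sigma * sigma) / sigma ** 4 * math.exp(-r * r / (2 * sigma * sigma))

def trapezoid(f, lo, hi, steps):
    """Composite trapezoidal rule on [lo, hi] with `steps` equal intervals."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h

sigma = 1.5
integral = trapezoid(lambda r: log_integrand(r, sigma),
                     -12 * sigma, 12 * sigma, 20000)

# Mask implementing 4 f(x, y) minus the four horizontal and vertical neighbors.
laplacian_mask = [[0, -1, 0],
                  [-1, 4, -1],
                  [0, -1, 0]]
mask_sum = sum(sum(row) for row in laplacian_mask)
```

The integral comes out at zero to within numerical error, and the mask coefficients sum to exactly zero, which is the spatial-domain counterpart of the transform being zero at the origin.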


Problem 10.12 


(a) Figure 10.15(g) was obtained from Fig. 10.15(h), which is a binary image, and thus
consists of sets of connected components of 1's (see Section 2.5.2 regarding connected
components). The boundary of each connected component forms a closed path
(Problem 2.14). The contours in Fig. 10.15(g) were obtained by noting transitions of the
boundaries of the connected components with the background, and thus form closed
paths.

(b) The answer is yes for functions that meet certain mild conditions, and if the zero
crossing method is based on rotational operators like the LoG function. Geometrical
properties of zero crossings in general are explained in some detail in the paper "On
Edge Detection," by V. Torre and T. Poggio, IEEE Trans. Pattern Analysis and Machine
Intell., vol. 8, no. 2, pp. 147-163. Looking up this paper and becoming familiar with
the mathematical underpinnings of edge detection is an excellent reading assignment for
graduate students.

Problem 10.13

(a) Point 1 has coordinates $x = 0$ and $y = 0$. Substituting into Eq. (10.2-3) yields
$\rho = 0$, which, in a plot of $\rho$ vs. $\theta$, is a straight line.

(b) Only the origin (0, 0) would yield this result.

(c) At $\theta = +90°$, it follows from Eq. (10.2-3) that $x \cdot (0) + y \cdot (1) = \rho$, or $y = \rho$. At
$\theta = -90°$, $x \cdot (0) + y \cdot (-1) = \rho$, or $-y = \rho$. Thus the reflective adjacency.

Problem 10.14 

(a) Express $x\cos\theta + y\sin\theta = \rho$ in the form $y = -(\cot\theta)x + \rho/\sin\theta$. Equating terms
with the slope-intercept form, $y = ax + b$, gives $a = -\cot\theta$ and $b = \rho/\sin\theta$.
This gives $\theta = \cot^{-1}(-a)$ and $\rho = b\sin\theta$. Once obtained from $a$ and $b$ of a given line,
the parameters $\theta$ and $\rho$ completely specify the normal representation of that line.

(b) $\theta = \cot^{-1}(2) = 26.6°$ and $\rho = (1)\sin\theta = 0.45$.
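
The conversion can be sketched as below. It assumes that the line in part (b) has slope $a = -2$ and intercept $b = 1$, which is what the numbers in the solution imply, and it uses the relation $\cot\theta = -a$. The function name is hypothetical, and `atan2` is used so that $\theta$ falls in (0°, 180°); for lines with positive slope this gives an angle above 90°, which may need to be mapped into whatever range of $\theta$ is in use.

```python
import math

def normal_form(a, b):
    """Convert y = a*x + b to (theta in degrees, rho) with x cos(theta) + y sin(theta) = rho."""
    theta = math.atan2(1.0, -a)      # cot(theta) = -a, theta in (0, pi)
    rho = b * math.sin(theta)
    return math.degrees(theta), rho

theta_deg, rho = normal_form(-2.0, 1.0)
```

For this line the sketch gives an angle of about 26.6° and a distance of about 0.45, in agreement with part (b).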

Problem 10.15 


This problem is a natural for the Hough transform, which is set up as follows: The $\theta$ axis
is divided into six subdivisions, corresponding to the six specified directions and their
error bands. For example (since the angle directions specified in the problem statement
are with respect to the horizontal) the first band for angle $\theta$ extends from $-30°$ to $-20°$,
corresponding to the $-25°$ direction and its $\pm 5°$ band. The $\rho$ axis extends from
$\rho = -\sqrt{D}$ to $\rho = +\sqrt{D}$, where $D$ is the largest distance between opposite corners of the
image, properly calibrated to fit the particular imaging setup used. The subdivisions in
the $\rho$ axis are chosen finely enough to resolve the minimum expected distance between
tracks that may be parallel, but have different origins, thus satisfying the last condition
of the problem statement.

Set up in this way, the Hough transform can be used as a "filter" to categorize all points
in a given image into groups of points in the six specified directions. Each group is then
processed further to determine if its points satisfy the criteria for a valid track: (1) each
group must have at least 100 points; and (2) it cannot have more than three gaps, each of
which cannot be more than 10 pixels long (see Problem 10.2 on the estimation of gaps
of a given length).

Problem 10.16


(a) The paths are shown in Fig. P10.16. These paths are as follows:

1   (1,1)(1,2) -> (2,1)(2,2) -> (3,1)(3,2)
2   (1,1)(1,2) -> (2,1)(2,2) -> (3,2)(2,2) -> (3,2)(3,3)
3   (1,1)(1,2) -> (2,2)(1,2) -> (2,2)(2,3) -> (3,2)(3,3)
4   (1,1)(1,2) -> (2,2)(1,2) -> (2,2)(2,3) -> (2,2)(3,2) -> (3,1)(3,2)
5   (1,2)(1,3) -> (2,2)(2,3) -> (3,2)(3,3)
6   (1,2)(1,3) -> (2,2)(2,3) -> (2,2)(3,2) -> (3,1)(3,2)
7   (1,2)(1,3) -> (1,2)(2,2) -> (2,1)(2,2) -> (3,1)(3,2)
8   (1,2)(1,3) -> (1,2)(2,2) -> (2,1)(2,2) -> (3,2)(2,2) -> (3,2)(3,3)


(b) From Fig. 10.24 and (a), we see that the optimum path is path 6. Its cost is c = 
2 + 0 + 1 + 1 = 4. 


Figure P10.16 (paths 1 through 8)


Problem 10.17 


From Eq. (10.2-6), $c(p, q) = H - [f(p) - f(q)]$. In this case $H = 8$. Assume that $p$ is to
the right as the image is traversed from left to right. The possible paths are shown in Fig.
P10.17(a). The costs are detailed in Fig. P10.17(b). The graph (with the minimum-cost
path shown dashed) is shown in Fig. P10.17(c). Finally, the edge corresponding to the
minimum-cost path is shown in Fig. P10.17(d).

Panels (a) through (d): possible paths, edge-element costs, graph with the minimum-cost
path, and resulting edge. Costs legible in the figure include 8 - [1 - 2] = 9,
8 - [1 - 1] = 8, 8 - [7 - 1] = 2, 8 - [1 - 7] = 14, 8 - [6 - 1] = 3, 8 - [8 - 1] = 1, and
8 - [2 - 7] = 13.

Figure P10.17
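
The cost of an individual edge element follows directly from Eq. (10.2-6). A minimal sketch, with $H = 8$ as stated in the solution and a hypothetical function name:

```python
H = 8  # value of H used in this problem

def edge_cost(f_p, f_q, h=H):
    """Cost c(p, q) = H - [f(p) - f(q)] of the edge element between pixels p and q."""
    return h - (f_p - f_q)

costs = [edge_cost(1, 2), edge_cost(7, 1), edge_cost(1, 7), edge_cost(8, 1)]
```

A large positive difference $f(p) - f(q)$ gives a low cost, so the minimum-cost path follows the strongest gray-level transitions.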


Problem 10.18 


(a) The number of boundary points between black and white regions is much larger in 
the image on the right. When the images are blurred, the boundary points will give rise 
to a larger number of different values for the image on the right, so the histograms of the 
two blurred images will be different. 

(b) To handle border effects, we surround the image with a border of 0's. We assume
that the image is of size $N \times N$ (the fact that the image is square is evident from the
right image in the problem statement). Blurring is implemented by a 3 x 3 mask whose
coefficients are 1/9. Figure P10.18 shows the different types of values that the blurred
left image (see problem statement) will have. These values are summarized in Table
P10.18-1. It is easily verified that the sum of the numbers on the left column of the
table is $N^2$.

Table P10.18-1

No. of Points           Value

N(N/2 - 1)              0
2                       2/9
N - 2                   3/9
4                       4/9
3N - 8                  6/9
(N - 2)(N/2 - 2)        1


A histogram is easily constructed from the entries in this table. A similar (tedious, but 
not difficult) procedure yields the results shown in Table P10.18-2 for the checkerboard 
image. 


Table P10.18-2

No. of Points           Value

N^2/2 - 14N + 98        0
28                      2/9
14N - 224               3/9
128                     4/9
98                      5/9
16N - 256               6/9
N^2/2 - 16N + 128       1


      |<----- N/2 pixels ----->|<----- N/2 pixels ----->|
 0     0    0   ...   0    0     0    0   ...   0    0     0
 0    4/9  6/9  ...  6/9  4/9   2/9   0   ...   0    0     0
 0    6/9   1   ...   1   6/9   3/9   0   ...   0    0     0
 :     :    :         :    :     :    :         :    :     :
 0    6/9   1   ...   1   6/9   3/9   0   ...   0    0     0
 0    4/9  6/9  ...  6/9  4/9   2/9   0   ...   0    0     0
 0     0    0   ...   0    0     0    0   ...   0    0     0

The outermost ring of 0's is the border of 0's added around the N x N image.

Figure P10.18
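
The entries of Table P10.18-1 can be checked by brute force. The sketch below assumes that the left image has its left $N/2$ columns equal to 1 and its right $N/2$ columns equal to 0, which is what Fig. P10.18 suggests; the function name and the choice $N = 16$ are arbitrary.

```python
from collections import Counter
from fractions import Fraction

def blurred_histogram(image):
    """Histogram of a binary image after 3 x 3 averaging with a border of 0's."""
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            # Pixels outside the image belong to the border of 0's and add nothing.
            total = sum(
                image[rr][cc]
                for rr in range(r - 1, r + 2) for cc in range(c - 1, c + 2)
                if 0 <= rr < rows and 0 <= cc < cols
            )
            counts[Fraction(total, 9)] += 1
    return counts

N = 16
left_image = [[1] * (N // 2) + [0] * (N // 2) for _ in range(N)]
hist = blurred_histogram(left_image)
```

For $N = 16$ the counts are 112, 2, 14, 4, 40, and 84 for the values 0, 2/9, 3/9, 4/9, 6/9, and 1, and they add up to $N^2 = 256$.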


Problem 10.19 

The gray level profile of one row of the image is shown in Fig. P10.19(a), and the
histogram of the image is shown in Fig. P10.19(b). The gray level profile of one
row in the wedge image is shown in Fig. P10.19(c), and its histogram is shown in
Fig. P10.19(d). The gray level profile of a row in the product image is shown in Fig.
P10.19(e). The histogram of the product is shown in Fig. P10.19(f).

Figure P10.19 (gray level profiles and histograms; axes: gray level and probability)


Problem 10.20 


(a) $A_1 = A_2$ and $\sigma_1 = \sigma_2 = \sigma$, which makes the two modes identical. If the number
of samples is not large, convergence to a value at or near the mid point between the two
means also requires that a clear valley exist between the two modes. We can guarantee
this by assuming that $\sigma \ll (m_1 + m_2)/2$.

(b) This condition cannot happen if $A_2 \neq 0$. This is easily established by starting
the algorithm with an initial value less than $m_1$. Even if the right mode associated with
$m_2$ is much smaller in size (e.g., $A_1 \gg A_2$ and $\sigma_1 \gg \sigma_2$), the average value of the
region to the left of the starting threshold will be smaller than the average of the region to
the right because the modes are symmetrical about their mean, and the mode associated
with $m_2$ will bias the data to the right. Thus, the next iterative step will bring the value
of the threshold closer to $m_1$, and eventually to the right of it. This analysis assumes that
enough points are available in order to avoid pathological cases in which the algorithm
can get "stuck" due to insufficient data that truly represents the shapes assumed in the
problem statement.

(c) $\sigma_2 \gg \sigma_1$. This will "draw" the threshold toward $m_2$ during iteration.
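
The behavior discussed in (a) and (b) can be explored with a small sketch of an iterative threshold update, assuming the algorithm in question is the basic scheme in which the new threshold is the average of the mean gray level below the current threshold and the mean gray level above it. The function name, the stopping tolerance `eps`, and the toy data are assumptions made for illustration.

```python
def iterative_threshold(values, t0, eps=0.5):
    """Repeat T <- (mean of values <= T + mean of values > T) / 2 until T settles."""
    t = t0
    while True:
        low = [v for v in values if v <= t]
        high = [v for v in values if v > t]
        if not low or not high:       # one side is empty: no further update possible
            return t
        t_new = 0.5 * (sum(low) / len(low) + sum(high) / len(high))
        if abs(t_new - t) < eps:
            return t_new
        t = t_new

# Two identical, symmetric modes centered at 50 and 150.
samples = [48, 49, 50, 51, 52, 148, 149, 150, 151, 152]
t_mid_start = iterative_threshold(samples, 60)
t_low_start = iterative_threshold(samples, 49)   # start to the left of the first mean
```

With identical modes the threshold settles at the midpoint, 100, in both runs, including the one that starts below the mean of the left mode.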


Problem 10.21 


The illumination function is a bell-shaped surface with its center at (500, 500). The 
value of illumination at this point is 1, and it decreases radially from there. Draw a 
series of concentric circles about point (500, 500) so that the value of i(x, y) at each 
circle is 0.1 less than at the circle before. Any two points between two adjacent circles
do not differ by more than 10% in illumination. Segment (threshold) the region between
adjacent circles. If the distance between circles is greater than 10 pixels, then we are told 
that the segmentation will be correct. That is, proper segmentation of areas greater than 
10 x 10 pixels is guaranteed in the problem statement, as long as illumination between 
any two points does not differ by more than 10%. Regions of 10 x 10 pixels will fit 
between concentric circles that are more than 10 pixels apart. If the distance between 
circles is less than 10 pixels, then the segmentation is not guaranteed to be perfect. 
But, there is nothing that can be done about that because changes in illumination are 
determined by the illumination function, which is given. 


Problem 10.22 


From the figure in the problem statement,

$$p_1(z) = \begin{cases} 0 & z < 1 \\ \frac{1}{2}z - \frac{1}{2} & 1 \le z \le 3 \\ 0 & z > 3 \end{cases}$$

and

$$p_2(z) = \begin{cases} 0 & z < 0 \\ -\frac{1}{2}z + 1 & 0 \le z \le 2 \\ 0 & z > 2. \end{cases}$$

The optimum threshold is the value $z = T$ for which $P_1 p_1(T) = P_2 p_2(T)$. In this case
$P_1 = P_2$, so

$$\frac{1}{2}T - \frac{1}{2} = -\frac{1}{2}T + 1$$

from which we get $T = 1.5$.


Problem 10.23 


Keeping the same sense of directions as in Problem 10.22, let $p_2(z)$ be the probability
density function given in the problem statement. The key in solving the problem is to
recognize that the direction of the "tail" of the Rayleigh function can be reversed as
follows:

$$p_1(z) = \begin{cases} \frac{2}{d}(-z + c)\, e^{-(-z + c)^2/d} & z < c \\ 0 & z \ge c. \end{cases}$$

Then, the optimum threshold, $T$, is found by solving the following equation for $T$:

$$P_1 p_1(T) = P_2 p_2(T).$$

Substituting the density functions into this equation yields

$$P_1 \frac{2}{d}(-T + c)\, e^{-(-T + c)^2/d} = P_2 \frac{2}{b}(T - a)\, e^{-(T - a)^2/b}$$

which must be solved for $T$ to find the optimum threshold. With the exception of some
possible additional reformatting (like taking the natural log), this is as far as we normally
expect students to carry this problem. However, it is important for the student to state
that the solution is valid only in the range $a < T < c$.


Problem 10.24 


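From Eq. (10.3-10),

$$P_1 p_1(T) = P_2 p_2(T).$$

Taking the ln of both sides yields

$$\ln P_1 + \ln p_1(T) = \ln P_2 + \ln p_2(T).$$

But

$$p_1(T) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\, e^{-\frac{(T - \mu_1)^2}{2\sigma_1^2}}$$

and

$$p_2(T) = \frac{1}{\sqrt{2\pi}\,\sigma_2}\, e^{-\frac{(T - \mu_2)^2}{2\sigma_2^2}}$$

so it follows that

$$\ln P_1 + \ln\frac{1}{\sqrt{2\pi}\,\sigma_1} - \frac{(T - \mu_1)^2}{2\sigma_1^2}
= \ln P_2 + \ln\frac{1}{\sqrt{2\pi}\,\sigma_2} - \frac{(T - \mu_2)^2}{2\sigma_2^2}$$

$$\ln P_1 - \ln\sigma_1 - \frac{(T - \mu_1)^2}{2\sigma_1^2} - \ln P_2 + \ln\sigma_2 + \frac{(T - \mu_2)^2}{2\sigma_2^2} = 0$$

$$\ln\frac{P_1}{P_2} + \ln\frac{\sigma_2}{\sigma_1} - \frac{1}{2\sigma_1^2}\left(T^2 - 2\mu_1 T + \mu_1^2\right) + \frac{1}{2\sigma_2^2}\left(T^2 - 2\mu_2 T + \mu_2^2\right) = 0$$

$$\ln\frac{\sigma_2 P_1}{\sigma_1 P_2} + T^2\left(\frac{1}{2\sigma_2^2} - \frac{1}{2\sigma_1^2}\right) + T\left(\frac{\mu_1}{\sigma_1^2} - \frac{\mu_2}{\sigma_2^2}\right) + \left(\frac{\mu_2^2}{2\sigma_2^2} - \frac{\mu_1^2}{2\sigma_1^2}\right) = 0.$$

From this expression we get

$$AT^2 + BT + C = 0$$

with

$$A = \sigma_1^2 - \sigma_2^2$$

$$B = 2\left(\mu_1\sigma_2^2 - \mu_2\sigma_1^2\right)$$

and

$$C = \sigma_1^2\mu_2^2 - \sigma_2^2\mu_1^2 + 2\sigma_1^2\sigma_2^2 \ln\frac{\sigma_2 P_1}{\sigma_1 P_2}.$$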
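
As a numerical check on the coefficients, the sketch below solves $AT^2 + BT + C = 0$ with $A = \sigma_1^2 - \sigma_2^2$, $B = 2(\mu_1\sigma_2^2 - \mu_2\sigma_1^2)$, and $C = \sigma_1^2\mu_2^2 - \sigma_2^2\mu_1^2 + 2\sigma_1^2\sigma_2^2\ln(\sigma_2 P_1/\sigma_1 P_2)$, and then confirms that each real root satisfies $P_1 p_1(T) = P_2 p_2(T)$. The function names and the parameter values are made up for illustration, and the two means are assumed to differ.

```python
import math

def gaussian(z, mu, sigma):
    """Gaussian probability density with mean mu and standard deviation sigma."""
    return math.exp(-(z - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def optimum_thresholds(p1, mu1, sigma1, p2, mu2, sigma2):
    """Real roots of A T^2 + B T + C = 0 (reduces to B T + C = 0 when A = 0)."""
    a = sigma1 ** 2 - sigma2 ** 2
    b = 2 * (mu1 * sigma2 ** 2 - mu2 * sigma1 ** 2)
    c = (sigma1 ** 2 * mu2 ** 2 - sigma2 ** 2 * mu1 ** 2
         + 2 * sigma1 ** 2 * sigma2 ** 2 * math.log(sigma2 * p1 / (sigma1 * p2)))
    if a == 0:                       # equal variances: a single threshold
        return [-c / b]
    disc = b * b - 4 * a * c
    if disc < 0:                     # no real threshold
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

roots = optimum_thresholds(0.4, 50.0, 10.0, 0.6, 120.0, 20.0)
```

With unequal variances the quadratic generally has two real roots; only the one lying between the two means is normally useful as a threshold.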


Problem 10.25 


If $\sigma_1 = \sigma_2 = \sigma$, then $A = 0$ in Eq. (10.3-12) and we have to solve the equation

$$BT + C = 0$$

with

$$B = 2\sigma^2(\mu_1 - \mu_2)$$

and

$$C = \sigma^2(\mu_2^2 - \mu_1^2) + 2\sigma^4 \ln\frac{P_1}{P_2}.$$

Substituting and cancelling terms gives

$$2(\mu_1 - \mu_2)T - (\mu_1 + \mu_2)(\mu_1 - \mu_2) + 2\sigma^2 \ln\frac{P_1}{P_2} = 0$$

or

$$T = \frac{\mu_1 + \mu_2}{2} + \frac{\sigma^2}{\mu_1 - \mu_2} \ln\frac{P_2}{P_1}.$$

Problem 10.26 


The simplest solution is to use the given means and standard deviations to form two
Gaussian probability density functions, and then to use the optimum thresholding
approach discussed in Section 10.3.5 [in particular, see Eqs. (10.3-11) through (10.3-13)].
The probabilities $P_1$ and $P_2$ can be estimated by visual analysis of the images (i.e., by
determining the relative areas of the image occupied by objects and background). It is
clear by looking at the image that the probability of occurrence of object points is less
than that of background points. Alternatively, an automatic estimate can be obtained by
thresholding the image into points with values greater than 200 and less than 110 (see
problem statement). Using the given parameters, the results would be good estimates of
the relative probability of occurrence of object and background points due to the
separation between means, and the relatively tight standard deviations. A more sophisticated
approach is to use the Chow-Kaneko procedure discussed in Section 10.3.5.


Problem 10.27 

Let $m_1$ and $m_2$ denote the mean gray level of objects and background, respectively, and
let $\sigma_1$ and $\sigma_2$ denote the corresponding standard deviations (see the problem statement
for specific values). We note that $\pm 2\sigma_2$ about the mean background level gives a range of
gray level values from 80 to 140, and that $\pm 2\sigma_1$ about the mean intensity of the objects
gives a range of 120 to 280, so a reasonable separation exists between the two gray level
populations. Choosing $m_1 = 200$ as the seed value is quite adequate. Regions are
then grown by appending to a seed any point that is 8-connected to any point previously
appended to that seed, and whose gray level is within $m_1 \pm 2\sigma_1$.
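
The region-growing rule just described can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `grow_regions`, the list-of-lists image representation, and the small test image are assumptions made for the example. The seed value 200 and the acceptance range 120 to 280 are the values quoted in the solution.

```python
from collections import deque


def grow_regions(image, seed_value, low, high):
    """Label regions grown from every pixel equal to seed_value.

    A pixel joins a region if it is 8-connected to a pixel already in that
    region and its gray level lies in [low, high]. Returns a label image in
    which 0 means "not part of any region".
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            # Start a new region only from unlabeled seed pixels.
            if image[r0][c0] != seed_value or labels[r0][c0]:
                continue
            next_label += 1
            labels[r0][c0] = next_label
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for dr in (-1, 0, 1):          # visit the 8 neighbors
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < rows and 0 <= cc < cols
                                and not labels[rr][cc]
                                and low <= image[rr][cc] <= high):
                            labels[rr][cc] = next_label
                            queue.append((rr, cc))
    return labels


# Hypothetical 5 x 5 test image: background near 100, two bright objects.
example = [
    [100, 110, 105, 100, 100],
    [100, 200, 190, 100, 100],
    [100, 130, 100, 100, 100],
    [100, 100, 100, 100, 250],
    [100, 100, 100, 200, 100],
]
labels = grow_regions(example, seed_value=200, low=120, high=280)
for row in labels:
    print(row)
```

A breadth-first queue is used here only for simplicity; any traversal that visits 8-connected neighbors satisfying the gray-level test gives the same regions.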

Problem 10.28 

The region splitting is shown in Fig. P10.28(a). The corresponding quadtree is shown 
in Fig. P10.28(b). 

Problem 10.29 


(a) The elements of $T[n]$ are the coordinates of points in the image below the plane
$g(x, y) = n$, where $n$ is an integer that represents a given step in the execution of the
algorithm. Since $n$ never decreases, the set of elements in $T[n-1]$ is a subset of the
elements in $T[n]$. In addition, we note that all the points below the plane $g(x, y) = n - 1$
are also below the plane $g(x, y) = n$, so the elements of $T[n]$ are never replaced.
Similarly, $C_n(M_i)$ is formed by the intersection of $C(M_i)$ and $T[n]$, where $C(M_i)$ (whose
elements never change) is the set of coordinates of all points in the catchment basin
associated with regional minimum $M_i$. Since the elements of $C(M_i)$ never change, and
the elements of $T[n]$ are never replaced, it follows that the elements in $C_n(M_i)$ are never
replaced either. In addition, we see that $C_{n-1}(M_i) \subseteq C_n(M_i)$.

(b) This part of the problem is answered by the same argument as in (a). Since (1) $n$
always increases; (2) the elements of neither $C_n(M_i)$ nor $T[n]$ are ever replaced; and
(3) $T[n-1] \subseteq T[n]$ and $C_{n-1}(M_i) \subseteq C_n(M_i)$, it follows that the number of elements
of both $C_n(M_i)$ and $T[n]$ either increases or remains the same.


Problem 10.30


Using the terminology of the watershed algorithm, a break in a boundary between two 
catchment basins would cause water between the two basins to merge. However, the 
heart of the algorithm is to build a dam higher than the highest gray level in the image
any time a break in such boundaries occurs. Since the entire topography is enclosed by 
such a dam, dams are built any time there is a break that causes water to merge between 
two regions, and segmentation boundaries are precisely the tops of the dams, it follows 
that the watershed algorithm always produces closed boundaries between regions. 


Problem 10.31 


The first step in the application of the watershed segmentation algorithm is to build a
dam of height max + 1 to prevent the rising water from running off the ends of the
function, as shown in Fig. P10.31(b). For an image function we would build a box of
height max + 1 around its border. The algorithm is initialized by setting $C[1] = T[1]$.
In this case, $T[1] = \{g(2)\}$, as shown in Fig. P10.31(c) (note the water level). There is
only one connected component in this case: $Q[1] = \{q_1\} = \{g(2)\}$.

Next, we let $n = 2$ and, as shown in Fig. P10.31(d), $T[2] = \{g(2), g(14)\}$ and
$Q[2] = \{q_1; q_2\}$, where, for clarity, different connected components are separated by
semicolons. We start construction of $C[2]$ by considering each connected component in
$Q[2]$. When $q = q_1$, the term $q \cap C[1]$ is equal to $\{g(2)\}$, so condition 2 is satisfied and,
therefore, $C[2] = \{g(2)\}$. When $q = q_2$, $q \cap C[1] = \emptyset$ (the empty set) so condition
1 is satisfied and we incorporate $q$ in $C[2]$, which then becomes $C[2] = \{g(2); g(14)\}$
where, as above, different connected components are separated by semicolons.

When $n = 3$ [Fig. P10.31(e)], $T[3] = \{2, 3, 10, 11, 13, 14\}$ and
$Q[3] = \{q_1; q_2; q_3\} = \{2, 3; 10, 11; 13, 14\}$ where, in order to simplify the notation,
we let $k$ denote $g(k)$. Proceeding as above, $q_1 \cap C[2] = \{2\}$ satisfies condition 2, so
$q_1$ is incorporated into the new set to yield $C[3] = \{2, 3; 14\}$. Similarly,
$q_2 \cap C[2] = \emptyset$ satisfies condition 1 and $C[3] = \{2, 3; 10, 11; 14\}$. Finally,
$q_3 \cap C[2] = \{14\}$ satisfies condition 2 and $C[3] = \{2, 3; 10, 11; 13, 14\}$. It is
easily verified that $C[4] = C[3] = \{2, 3; 10, 11; 13, 14\}$.

When $n = 5$ [Fig. P10.31(f)], we have $T[5] = \{2, 3, 5, 6, 10, 11, 12, 13, 14\}$ and
$Q[5] = \{q_1; q_2; q_3\} = \{2, 3; 5, 6; 10, 11, 12, 13, 14\}$ (note the merging of two previously
distinct connected components). It is easily verified that $q_1 \cap C[4]$ satisfies condition 2
and that $q_2 \cap C[4]$ satisfies condition 1. Proceeding with these two connected components
exactly as above yields $C[5] = \{2, 3; 5, 6; 10, 11; 13, 14\}$ up to this point. Things
get more interesting when we consider $q_3$. Now, $q_3 \cap C[4] = \{10, 11; 13, 14\}$ which,
since it contains two connected components of $C[4]$, satisfies condition 3. As mentioned
previously, this is an indication that water from two different basins has merged and a
dam must be built to prevent this. Dam building is nothing more than separating $q_3$ into
the two original connected components. In this particular case, this is accomplished by
the dam shown in Fig. P10.31(g), so that now $q_3 = \{q_{31}; q_{32}\} = \{10, 11; 13, 14\}$. Then,
$q_{31} \cap C[4]$ and $q_{32} \cap C[4]$ each satisfy condition 2 and we have the final result for $n = 5$,
$C[5] = \{2, 3; 5, 6; 10, 11; 13, 14\}$.

Continuing in the manner just explained yields the final segmentation result shown in
Fig. P10.31(h), where the "edges" are visible (from the top) just above the water line. A
final post-processing step would remove the outer dam walls to yield the inner edges of
interest.
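
The bookkeeping in this solution (forming $T[n]$, splitting it into the connected components $Q[n]$, and testing each component against $C[n-1]$) can be sketched for a 1-D function in Python. The gray-level values in `G` below are hypothetical: they were chosen only so that the sets $T[n]$ match the ones quoted in the solution, since the actual function appears in Fig. P10.31. The helper names are likewise assumptions, and dam construction itself is not implemented.

```python
def threshold_set(g, n):
    """Indices k with g(k) < n, i.e. the points lying below the plane g = n."""
    return sorted(k for k, value in g.items() if value < n)


def components(points):
    """Split a sorted list of indices into runs of consecutive integers."""
    runs = []
    for k in points:
        if runs and k == runs[-1][-1] + 1:
            runs[-1].append(k)
        else:
            runs.append([k])
    return runs


def condition(q, c_prev):
    """Return 1, 2 or 3 according to how many components of C[n-1] q meets.

    1: q meets no component (a new minimum is found)
    2: q meets exactly one component (q is merged into that basin)
    3: q meets two or more components (a dam must be built)
    """
    hits = sum(1 for comp in c_prev if set(q) & set(comp))
    return min(hits, 2) + 1


# Hypothetical 1-D function consistent with the sets T[n] in the solution.
G = {1: 6, 2: 0, 3: 2, 4: 5, 5: 4, 6: 4, 7: 6, 8: 7,
     9: 5, 10: 2, 11: 2, 12: 4, 13: 2, 14: 1, 15: 6}

C4 = [[2, 3], [10, 11], [13, 14]]          # C[4] as given in the solution
Q5 = components(threshold_set(G, 5))       # connected components of T[5]
conditions = [condition(q, C4) for q in Q5]
print(Q5)
print(conditions)
```

For these values the three components of $Q[5]$ satisfy conditions 2, 1, and 3, respectively, which is the situation analyzed in the text for $n = 5$.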


Problem 10.32 


With reference to Eqs. (10.6-4) and (10.6-3), we see that comparing the negative ADI 
against a positive, rather than a negative, threshold would yield the image negative of 
the positive ADI. The result is shown in the left of Fig. P10.32. The image on the right 
is the positive ADI from Fig. 10.49(b). We have included it here for convenience in 
making the comparison. 





Figure P10.32 


Problem 10.33


(a) True, assuming that the threshold is not set larger than all the differences encountered
as the object moves. The easiest way to see this is to draw a simple reference image,
such as a white rectangle on a black background. Let that rectangle be the object that
moves. Since the absolute ADI image value at any location is the absolute difference
between the reference and the new image, it is easy to see that as the object enters areas
that are background in the reference image, the absolute difference will change from
zero to nonzero at the new area occupied by the moving object. Thus, as long as the
object moves, the dimension of the nonzero region of the absolute ADI will grow.

(b) True. The positive ADI is stationary and equal to the dimensions of the moving
object because the differences between the reference and the moving object never exceed
the threshold in areas that are background in the reference image (assuming, as in
Eq. (10.6-3), that the background has lower values than the object).

(c) True. From Eq. (10.6-4), we see that the difference between the background and the
object will always be negative (assuming, as in Eq. (10.6-4), that the gray levels in the
object exceed the value of the background). Assuming also that the differences are more
negative than the threshold, we see for the same reason as in (a) that all new background 
areas occupied by the moving object will have nonzero counts, thus increasing the di- 
mension of the nonzero entries in the negative ADI (keep in mind that the values in this 
image are counts). 


Problem 10.34 


Consider first the fact that motion in the x-direction is zero. When all components
of an image are stationary, $g_x(t, a_1)$ is a constant, and its Fourier transform yields an
impulse at the origin. Therefore, Fig. 10.53 would now consist of a single impulse at
the origin. The other two peaks shown in the figure would no longer be present. To
handle the motion in the positive y-direction and its change to the opposite direction, recall
that the Fourier transform is a linear process, so we can use superposition to obtain a
solution. The first part of the motion is in the positive y-direction at 1 pixel/frame. This
is the same as in Example 10.21, so the peaks corresponding to this part of the motion
are the same as the ones shown in Fig. 10.54. The reversal of motion is instantaneous,
so the 33rd frame would show the object traveling in exactly the opposite direction. To
handle this, we simply change $a_2$ to $-a_2$ in Eq. (10.6-7). Based on the discussion in
connection with Eq. (10.6-5), all this change would do is produce peaks at frequencies
$u = -a_2 V_2$ and $u = K + a_2 V_2$. From Example 10.21 we know that the value of $a_2$ is 4.
From the problem statement, we know that $V_2 = 1$ and $K = 32$. Thus, we have two
new peaks added to Fig. 10.54: one at $u = -4$ and the other at $u = 36$. As noted
above, the original peaks correspond to the motion in the positive y-direction given in
the problem statement, which is the same as in Example 10.21. Note that the frame
count was restarted from 0 to 31 with the change in direction.


Problem 10.35 


(a) It is given that 10% of the image area in the horizontal direction is occupied by a
bullet that is 2.5 cm long. Since the imaging device is square (256 × 256 elements) the
camera looks at an area that is 25 cm × 25 cm, assuming no optical distortions. Thus,
the distance between pixels is 25/256 = 0.098 cm/pixel. The maximum speed of the bullet
is 1000 m/sec = 100,000 cm/sec. At this speed, the bullet will travel 100,000/0.098 =
$1.02 \times 10^6$ pixels/sec. It is required that the bullet not travel more than one pixel during
exposure. That is, ($1.02 \times 10^6$ pixels/sec) × $K$ sec ≤ 1 pixel. So, $K \le 9.8 \times 10^{-7}$ sec.

(b) The frame rate must be fast enough to capture at least two images of the bullet in
successive frames so that the speed can be computed. If the frame rate is set so that
the bullet cannot travel a distance longer (between successive frames) than one half the
width of the image, then we have the cases shown in Fig. P10.35. In cases A and E
we get two shots of the entire bullet in frames $t_2$ and $t_3$, and $t_1$ and $t_2$, respectively.
In the other cases we get partial bullets. Although these cases could be handled with
some processing (e.g., by determining size, leading and trailing edges, and so forth) it is
possible to guarantee that at least two complete shots of every bullet will be available by
setting the frame rate so that a bullet cannot travel more than one half the width of the
frame, minus the length of the bullet. The length of the bullet in pixels is (2.5 cm)/(0.098
cm/pixel) ≈ 26 pixels. One half of the image frame is 128 pixels, so the maximum travel
distance allowed is 102 pixels. Since the bullet travels at a maximum speed of $1.02 \times 10^6$
pixels/sec, the minimum frame rate is $1.02 \times 10^6/102 = 10^4$ frames/sec.
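
The arithmetic in parts (a) and (b) can be reproduced with a short Python sketch. The function and key names are assumptions made for the example; all numerical inputs come from the solution. Unrounded intermediate values are used (25.6 pixels for the bullet length and 102.4 pixels for the allowed travel), so they differ slightly from the rounded figures 26 and 102 quoted above while giving the same final answers.

```python
def bullet_timing(bullet_cm=2.5, fraction_of_width=0.10, pixels=256,
                  speed_m_per_sec=1000.0):
    """Exposure time and frame rate limits for the bullet-imaging setup."""
    field_cm = bullet_cm / fraction_of_width          # 25 cm field of view
    pitch_cm = field_cm / pixels                      # cm per pixel
    speed_px = speed_m_per_sec * 100.0 / pitch_cm     # pixels per second
    max_exposure = 1.0 / speed_px                     # at most 1 pixel of travel
    bullet_px = bullet_cm / pitch_cm                  # bullet length in pixels
    max_travel_px = pixels / 2.0 - bullet_px          # half frame minus bullet
    min_frame_rate = speed_px / max_travel_px         # frames per second
    return {
        "pitch_cm": pitch_cm,
        "pixels_per_sec": speed_px,
        "max_exposure_sec": max_exposure,
        "bullet_px": bullet_px,
        "max_travel_px": max_travel_px,
        "min_frames_per_sec": min_frame_rate,
    }


result = bullet_timing()
for name, value in result.items():
    print(f"{name}: {value:.6g}")
```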

(c) In a flashing situation with a reflective object, the images will tend to be dark, with 
the object shining brightly. The techniques discussed in Section 10.6.1 would then be 
quite adequate. 

(d) First we have to determine if a partial or whole image of the bullet has been obtained.
After the pixels corresponding to the object have been identified using motion
segmentation, we determine if the object runs into the left boundary (see the solution to Problem
9.27 regarding a method for determining if a binary object runs into the boundary of an
image). If it does, we look at the next two frames, with the assurance that a complete
image of the bullet has been obtained in each because of the frame rate in (b). If the
object does not run into the left boundary, we are similarly assured of two full shots in
two of the three frames. We then compute the centroid of the object in each image and
count the number of pixels between the centroids. Since the distance between pixels and
the time between frames are known, computation of the speed is a trivial problem. The
principal uncertainty in this approach is how well the object is segmented. However,
since the images are of the same object in basically the same geometry, consistency of
segmentation between frames can be expected.

Figure P10.35


Problem 11.1

(a) The key to this problem is to recognize that the value of every element in a chain 
code is relative to the value of its predecessor. The code for a boundary that is traced 
in a consistent manner (e.g., clockwise) is a unique circular set of numbers. Starting 
at different locations in this set does not change the structure of the circular sequence. 
Selecting the smallest integer as the starting point simply identifies the same point in the 
sequence. Even if the starting point is not unique, this method would still give a unique 
sequence. For example, the sequence 101010 has three possible starting points, but they 
all yield the same smallest integer 010101. 

(b) Code: 11076765543322. The starting point is 0, yielding the sequence

07676554332211.
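
The normalization in (a) and (b) amounts to choosing, among all circular rotations of the code, the one that forms the smallest integer. A minimal Python sketch follows; the function name is an assumption made for the example. Because all rotations have the same number of digits, comparing them as strings gives the same ordering as comparing them as integers.

```python
def normalize_start(code):
    """Return the circular rotation of a chain code of minimum magnitude."""
    rotations = [code[i:] + code[:i] for i in range(len(code))]
    return min(rotations)


print(normalize_start("11076765543322"))  # 07676554332211
print(normalize_start("101010"))          # 010101
```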

Problem 11.2 

(a) The first difference only counts the number of directions that separate adjacent el- 
ements of the code. Since the counting process is independent of direction, the first 
difference is independent of boundary rotation. (It is worthwhile to point out to students 
that the assumption here is that rotation does not change the code itself). 

(b) Code: 0101030303323232212111. Difference: 3131331313031313031300. (Note 
that the code was treated as a circular sequence, so the first element of the difference is 
the transition between the last and first element of the code, as explained in the text). 
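
The first difference in (b) can be checked with the short Python sketch below. It assumes a 4-directional code (the digits in the code above range from 0 to 3) and treats the code as a circular sequence, so that the first element of the difference is the transition from the last element of the code to the first. The function name is an assumption made for the example.

```python
def first_difference(code, directions=4):
    """Circular first difference of a chain code.

    Each element is the number of direction steps (counted modulo
    `directions`) separating a code element from its predecessor.
    """
    digits = [int(ch) for ch in code]
    # digits[i - 1] with i == 0 wraps around to the last element.
    return "".join(str((digits[i] - digits[i - 1]) % directions)
                   for i in range(len(digits)))


print(first_difference("0101030303323232212111"))  # 3131331313031313031300
```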

Problem 11.3 


(a) The rubber-band approach forces the polygon to have vertices at every inflection 
of the cell wall. That is, the locations of the vertices are fixed by the structure of the 
inner and outer walls. Since the vertices are joined by straight lines, this produces the
minimum-perimeter polygon for any given wall configuration. 

(b) If a corner of a cell is centered at a pixel on the boundary, and the cell is such that the
rubber band is tightened on the opposite corner, we would have a situation as shown in
Fig. P11.3. Assuming that the cell is of size $d \times d$, the maximum difference between the
pixel and the boundary in that cell is $\sqrt{2}d$. If cells are centered on pixels, the maximum
difference is $(\sqrt{2}d)/2$.


Figure P11.3


Problem 11.4 


(a) The resulting polygon would contain all the boundary pixels. 

(b) Actually, in both cases the resulting polygon would contain all the boundary pixels. 


Problem 11.5 


(a) The solution is shown in Fig. P11.5(b). (b) The solution is shown in Fig. P11.5(c).

Figure P11.5


Problem 11.6 


(a) From Fig. P11.6(a), we see that the distance from the origin to the triangle is given
by

$$
r(\theta) =
\begin{cases}
\dfrac{D_0}{\cos\theta} & 0^\circ \le \theta < 60^\circ \\[2ex]
\dfrac{D_0}{\cos(120^\circ - \theta)} & 60^\circ \le \theta < 120^\circ \\[2ex]
\dfrac{D_0}{\cos(\theta - 120^\circ)} & 120^\circ \le \theta < 180^\circ \\[2ex]
\dfrac{D_0}{\cos(240^\circ - \theta)} & 180^\circ \le \theta < 240^\circ \\[2ex]
\dfrac{D_0}{\cos(\theta - 240^\circ)} & 240^\circ \le \theta < 300^\circ \\[2ex]
\dfrac{D_0}{\cos(360^\circ - \theta)} & 300^\circ \le \theta < 360^\circ
\end{cases}
$$

where $D_0$ is the perpendicular distance from the origin to one of the sides of the triangle,
and $D = D_0/\cos(60^\circ) = 2D_0$. Once the coordinates of the vertices of the triangle are
given, determining the equation of each straight line is a simple problem, and $D_0$ (which
is the same for the three straight lines) follows from elementary geometry.
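
(b) From Fig. P11.6(b),

$$
r(\theta) =
\begin{cases}
\dfrac{B}{2\cos\theta} & 0^\circ \le \theta < \varphi \\[2ex]
\dfrac{A}{2\cos(90^\circ - \theta)} & \varphi \le \theta < 90^\circ \\[2ex]
\dfrac{A}{2\cos(\theta - 90^\circ)} & 90^\circ \le \theta < 180^\circ - \varphi \\[2ex]
\dfrac{B}{2\cos(180^\circ - \theta)} & 180^\circ - \varphi \le \theta < 180^\circ \\[2ex]
\dfrac{B}{2\cos(\theta - 180^\circ)} & 180^\circ \le \theta < 180^\circ + \varphi \\[2ex]
\dfrac{A}{2\cos(270^\circ - \theta)} & 180^\circ + \varphi \le \theta < 270^\circ \\[2ex]
\dfrac{A}{2\cos(\theta - 270^\circ)} & 270^\circ \le \theta < 360^\circ - \varphi \\[2ex]
\dfrac{B}{2\cos(360^\circ - \theta)} & 360^\circ - \varphi \le \theta < 360^\circ
\end{cases}
$$

where $\varphi = \tan^{-1}(A/B)$.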


(c) The equation of the ellipse in Fig. P11.6(c) is

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$

We are interested in the distance from the origin to an arbitrary point $(x, y)$ on the ellipse.
In polar coordinates,

$$x = r\cos\theta$$

and

$$y = r\sin\theta$$

where $r$ is the distance from the origin to $(x, y)$:

$$r = \sqrt{x^2 + y^2}.$$

Substituting into the equation of the ellipse we obtain

$$\frac{r^2\cos^2\theta}{a^2} + \frac{r^2\sin^2\theta}{b^2} = 1$$

from which we obtain the desired result:

$$r(\theta) = \frac{1}{\left[\left(\dfrac{\cos\theta}{a}\right)^2 + \left(\dfrac{\sin\theta}{b}\right)^2\right]^{1/2}}.$$

When $b = a$, we have the familiar equation of a circle, $r(\theta) = a$, or $x^2 + y^2 = a^2$.


Plots of the three signatures just derived are shown in Fig. P11.6(d)-(f).


Problem 11.8

(a) In the first case, $N(p) = 5$, $S(p) = 1$, $p_2 \cdot p_4 \cdot p_6 = 0$, and $p_4 \cdot p_6 \cdot p_8 = 0$, so
Eq. (11.1-1) is satisfied and $p$ is flagged for deletion. In the second case, $N(p) = 1$,
so Eq. (11.1-1) is violated and $p$ is left unchanged. In the third case $p_2 \cdot p_4 \cdot p_6 = 1$
and $p_4 \cdot p_6 \cdot p_8 = 1$, so conditions (c) and (d) of Eq. (11.1-1) are violated and $p$ is
left unchanged. In the fourth case $S(p) = 2$, so condition (b) is violated and $p$ is left
unchanged.

(b) In the first case $p_2 \cdot p_6 \cdot p_8 = 1$ so condition (d') in Eq. (11.1-3) is violated and $p$ is
left unchanged. In the second case $N(p) = 1$ so $p$ is left unchanged. In the third case
(c') and (d') are violated and $p$ is left unchanged. In the fourth case $S(p) = 2$ and $p$ is
left unchanged.

Problem 11.9 


(a) The result is shown in Fig. P11.9(b). (b) The result is shown in Fig. P11.9(c).

Figure P11.9 [panels (a), (b), and (c)]


Problem 11.10 


(a) The number of symbols in the first difference is equal to the number of segment
primitives in the boundary, so the shape order is 12.

(b) Starting at the top left corner,

Chain code:  000332123211
Difference:  300303311330
Shape no.:   003033113303
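
The three rows above are related by two operations: the circular first difference of the chain code, followed by selection of the rotation of that difference with minimum magnitude. The Python sketch below, which assumes a 4-directional code and uses a hypothetical function name, reproduces both results.

```python
def shape_number(chain, directions=4):
    """Return (first difference, shape number) for a circular chain code."""
    digits = [int(ch) for ch in chain]
    # Circular first difference: index -1 wraps to the last code element.
    diff = "".join(str((digits[i] - digits[i - 1]) % directions)
                   for i in range(len(digits)))
    # Shape number: rotation of the difference that forms the smallest integer.
    smallest = min(diff[i:] + diff[:i] for i in range(len(diff)))
    return diff, smallest


difference, number = shape_number("000332123211")
print(difference)  # 300303311330
print(number)      # 003033113303
```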


Problem 11.11 

With reference to the discussion in Section 4.6.1, the DFT can be real only if the data 
sequence is conjugate symmetric. Only contours that are symmetric with respect to the 
origin have this property. The axis system of Fig. 11.13 would have to be set up so that 
this condition is satisfied for symmetric figures. This can be accomplished by placing 
the origin at the center of gravity of the contour. 

Problem 11.12 

The mean is sufficient. 

Problem 11.13 

Two ellipses with different, say, major axes, have signatures with the same mean and
third statistical moment descriptors (both due to symmetry) but different second moment
(due to spread).

Problem 11.14 


This problem can be solved by using two descriptors: holes and the convex deficiency
(see Section 9.5.4 regarding the convex hull and convex deficiency of a set). The
decision making process can be summarized in the form of a simple decision tree, as follows: If
the character has two holes, it is an 8. If it has one hole it is a 0 or a 9. Otherwise, it is
a 1 or an X. To differentiate between 0 and 9 we compute the convex deficiency. The
presence of a "significant" deficiency (say, having an area greater than 20% of the area
of a rectangle that encloses the character) signifies a 9; otherwise we classify the
character as a 0. We follow a similar procedure to separate a 1 from an X. The presence of
a convex deficiency with four components whose centroids are located approximately in
the North, East, West, and South quadrants of the character indicates that the character is
an X. Otherwise we say that the character is a 1. This is the basic approach.
Implementation of this technique in a real character recognition environment has to take into
account other factors such as multiple "small" components in the convex deficiency due
to noise, differences in orientation, open loops, and the like. However, the material in
Chapters 3, 9 and 11 provides a solid base from which to formulate solutions.


Problem 11.15 


We can use the position operator P: "2m pixels to the right and 2m pixels below." Other
possibilities are P: "2m pixels to the right," and P: "2m pixels below." The first choice
is better in terms of retaining the "flavor" of a checkerboard.


Problem 11.16 


(a) The image is

0 1 0 1 0
1 0 1 0 1
0 1 0 1 0
1 0 1 0 1
0 1 0 1 0

Let $z_1 = 0$ and $z_2 = 1$. Since there are only two gray levels the matrix $A$ is of order
2 × 2. Element $a_{11}$ is the number of pixels valued 0 located one pixel to the right of a 0.
By inspection, $a_{11} = 0$. Similarly, $a_{12} = 10$, $a_{21} = 10$, and $a_{22} = 0$. The total number
of pixels satisfying the predicate $P$ is 20, so

$$C = \begin{bmatrix} 0 & 1/2 \\ 1/2 & 0 \end{bmatrix}.$$

(b) In this case, $a_{11}$ is the number of 0's two pixels to the right of a pixel valued 0. By
inspection, $a_{11} = 8$. Similarly, $a_{12} = 0$, $a_{21} = 0$, and $a_{22} = 7$. The number of pixels
satisfying $P$ is 15, so

$$C = \begin{bmatrix} 8/15 & 0 \\ 0 & 7/15 \end{bmatrix}.$$
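
Both matrices can be verified by direct counting. In the Python sketch below, the function name and the `(dx, dy)` displacement arguments are assumptions made for the example. The entry `counts[i][j]` counts pairs in which a pixel has level `i` and the pixel displaced by `(dx, dy)` from it has level `j`; depending on how the position operator is read, this may be the transpose of the text's $a_{ij}$, but for this image the two conventions give the same matrices.

```python
from fractions import Fraction


def cooccurrence(image, dx, dy, levels):
    """Count level pairs separated by (dx, dy) and normalize by the total.

    Returns (counts, normalized), where normalized holds exact fractions.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dy, c + dx
            if 0 <= rr < rows and 0 <= cc < cols:   # pair must fit in the image
                counts[image[r][c]][image[rr][cc]] += 1
    total = sum(sum(row) for row in counts)
    normalized = [[Fraction(v, total) for v in row] for row in counts]
    return counts, normalized


CHECKER = [
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
]

counts_a, c_a = cooccurrence(CHECKER, dx=1, dy=0, levels=2)  # one pixel right
counts_b, c_b = cooccurrence(CHECKER, dx=2, dy=0, levels=2)  # two pixels right
print(counts_a, counts_b)
```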


Problem 11.17 


When assigning this problem, the Instructor may wish to point the student to the review
of matrices and vectors in the book web site.

From Eq. (11.4-6),

$$\mathbf{y} = \mathbf{A}(\mathbf{x} - \mathbf{m}_x).$$

Then,

$$
\begin{aligned}
\mathbf{m}_y = E\{\mathbf{y}\} &= E\{\mathbf{A}(\mathbf{x} - \mathbf{m}_x)\} \\
&= \mathbf{A}[E\{\mathbf{x}\} - E\{\mathbf{m}_x\}] \\
&= \mathbf{A}[\mathbf{m}_x - \mathbf{m}_x] \\
&= \mathbf{0}.
\end{aligned}
$$

This establishes the validity of Eq. (11.4-7).


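To prove the validity of Eq. (11.4-8), we start with the definition of the covariance matrix
given in Eq. (11.4-3):

$$\mathbf{C}_y = E\{(\mathbf{y} - \mathbf{m}_y)(\mathbf{y} - \mathbf{m}_y)^T\}.$$

Since $\mathbf{m}_y = \mathbf{0}$, it follows that

$$
\begin{aligned}
\mathbf{C}_y &= E\{\mathbf{y}\mathbf{y}^T\} \\
&= E\{[\mathbf{A}(\mathbf{x} - \mathbf{m}_x)][\mathbf{A}(\mathbf{x} - \mathbf{m}_x)]^T\} \\
&= \mathbf{A}E\{(\mathbf{x} - \mathbf{m}_x)(\mathbf{x} - \mathbf{m}_x)^T\}\mathbf{A}^T \\
&= \mathbf{A}\mathbf{C}_x\mathbf{A}^T.
\end{aligned}
$$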

Showing the validity of Eq. (11.4-9) is a little more complicated. We start by noting that
covariance matrices are real and symmetric. From basic matrix algebra, it is known that
a real symmetric matrix of order $n$ has $n$ linearly independent eigenvectors (which are
easily orthonormalized by, say, the Gram-Schmidt procedure). The rows of matrix $\mathbf{A}$
are the orthonormal eigenvectors of $\mathbf{C}_x$. Then,

$$
\begin{aligned}
\mathbf{C}_x\mathbf{A}^T &= \mathbf{C}_x[\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n] \\
&= [\mathbf{C}_x\mathbf{e}_1, \mathbf{C}_x\mathbf{e}_2, \ldots, \mathbf{C}_x\mathbf{e}_n] \\
&= [\lambda_1\mathbf{e}_1, \lambda_2\mathbf{e}_2, \ldots, \lambda_n\mathbf{e}_n] \\
&= \mathbf{A}^T\mathbf{D}
\end{aligned}
$$

where use was made of the definition of an eigenvector (i.e., $\mathbf{C}_x\mathbf{e}_i = \lambda_i\mathbf{e}_i$) and $\mathbf{D}$ is a
diagonal matrix composed of the eigenvalues of $\mathbf{C}_x$:

$$
\mathbf{D} =
\begin{bmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_n
\end{bmatrix}.
$$

Premultiplying both sides of the preceding equation by matrix $\mathbf{A}$ gives

$$\mathbf{A}\mathbf{C}_x\mathbf{A}^T = \mathbf{A}\mathbf{A}^T\mathbf{D} = \mathbf{D}$$

where we used the fact that $\mathbf{A}^T\mathbf{A} = \mathbf{A}\mathbf{A}^T = \mathbf{I}$ because the rows of $\mathbf{A}$ are orthonormal
vectors. Thus, since $\mathbf{C}_y = \mathbf{A}\mathbf{C}_x\mathbf{A}^T$, we have shown that $\mathbf{C}_y$ is a diagonal matrix
which is produced by diagonalizing matrix $\mathbf{C}_x$ using a transformation matrix composed
of its eigenvectors. The eigenvalues of $\mathbf{C}_y$ are seen to be the same as the eigenvalues
of $\mathbf{C}_x$. (Recall that the eigenvalues of a diagonal matrix are its diagonal terms). The
fact that $\mathbf{C}_y\mathbf{e}_i = \mathbf{D}\mathbf{e}_i = \lambda_i\mathbf{e}_i$ shows that the eigenvectors of $\mathbf{C}_y$ are equal to the
eigenvectors of $\mathbf{C}_x$.


Problem 11.18 


The mean square error, given by Eq. (11.4-12), is the sum of the eigenvalues whose
corresponding eigenvectors are not used in the transformation. In this particular case,
the four smallest eigenvalues are applicable (see Table 11.5), so the mean square error is

$$e_{ms} = \sum_{j=3}^{6} \lambda_j = 280.$$

The maximum error occurs when $K = 0$ in Eq. (11.4-12) which then is the sum of all the
eigenvalues, or 4421 in this case. Thus, the error incurred by using the two eigenvectors
corresponding to the largest eigenvalues is only 6.3% of the total possible error.


Problem 11.19 


This problem is similar to the previous one. The covariance matrix is of order 4096 ×
4096 because the images are of size 64 × 64. It is given that the covariance matrix is the
identity matrix, so all its 4096 eigenvalues are equal to 1. From Eq. (11.4-12), the mean
square error is

$$e_{ms} = \sum_{j=1}^{4096} \lambda_j - \sum_{i=1}^{2048} \lambda_i = 2048.$$


Problem 11.20 


When the boundary is symmetric about both the major and minor axes and both axes
intersect at the centroid of the boundary.


Problem 11.21 


A solution using the relationship "connected to" is shown in Fig. P11.21.

Figure P11.21


Problem 11.22 


We can compute a measure of texture using the expression

$$R(x, y) = 1 - \frac{1}{1 + \sigma^2(x, y)}$$

where $\sigma^2(x, y)$ is the gray-level variance computed in a neighborhood of $(x, y)$. The
size of the neighborhood must be sufficiently large so as to contain enough samples to
have a stable estimate of the mean and variance. Neighborhoods of size 7 × 7 or 9 × 9
generally are appropriate for a low-noise case such as this.


Since the variance of normal wafers is known to be 400, we can obtain a normal value
for $R(x, y)$ by using $\sigma^2 = 400$ in the above equation. An abnormal region will have
a variance of about $(50)^2 = 2{,}500$ or higher, yielding a larger value of $R(x, y)$. The
procedure then is to compute $R(x, y)$ at every point $(x, y)$ and label that point as 0 if
it is normal and 1 if it is not. At the end of this procedure we look for clusters of 1's
using, for example, connected components (see Section 9.5.3 regarding computation of
connected components). If the area (number of pixels) of any connected component
exceeds 400 pixels, then we classify the sample as defective.
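
The labeling step can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive implementation: the function names, the clipping of the neighborhood at the image border, and the decision value `r_threshold = 0.999` are all choices made for the example (the solution does not specify a threshold on $R$). The variances 400 and 2,500 are the values given above.

```python
def texture_measure(variance):
    """R = 1 - 1 / (1 + variance)."""
    return 1.0 - 1.0 / (1.0 + variance)


def local_variance(image, row, col, half):
    """Gray-level variance in a (2*half+1)-square window, clipped at borders."""
    values = [image[i][j]
              for i in range(max(0, row - half), min(len(image), row + half + 1))
              for j in range(max(0, col - half), min(len(image[0]), col + half + 1))]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def label_abnormal(image, half, r_threshold):
    """Label a pixel 1 when R exceeds r_threshold (hypothetical), else 0."""
    return [[1 if texture_measure(local_variance(image, r, c, half)) > r_threshold
             else 0
             for c in range(len(image[0]))]
            for r in range(len(image))]


r_normal = texture_measure(400.0)     # about 0.99751
r_abnormal = texture_measure(2500.0)  # about 0.99960
print(r_normal, r_abnormal)

flat = [[120] * 6 for _ in range(6)]                                   # no texture
rough = [[100 * ((r + c) % 2) for c in range(6)] for r in range(6)]    # strong texture
print(label_abnormal(flat, half=1, r_threshold=0.999))
print(label_abnormal(rough, half=1, r_threshold=0.999))
```

The two values of $R$ differ only in the third decimal place, so in practice the decision value has to be set with some care, or the comparison can be made on the variance itself.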


Problem 11.23 


This problem has four major parts: (1) detecting individual bottles in an image; (2)
finding the top of each bottle; (3) finding the neck and shoulder of each bottle; and (4)
determining the level of the liquid in the region between the neck and the shoulder.

(1) Finding individual bottles. Note that the background in the sample image is much
darker than the bottles. We assume that this is true in all images. Then, a simple way
to find individual bottles is to find vertical black stripes in the image having a width
determined by the average separation between bottles, a number that is easily computable
from images representative of the actual setup during operation. We can find these
stripes in various ways. One way is to smooth the image to reduce the effects of noise
(we assume that, say, a 3 × 3 or 5 × 5 averaging mask is sufficient). Then, we run a
horizontal scan line through the middle of the image. The low values in the scan line will
correspond to the black or nearly black background. Each bottle will produce a
significant rise and fall of gray level in the scan line for the width of the bottle. Bottles
that are fully in the field of view of the camera will have a predetermined average width.
Bottles that are only partially in the field of view will have narrower profiles, and can
be eliminated from further analysis (but we need to make sure that the trailing
incomplete bottles are analyzed in the next image; presumably, the leading partial bottle was
already processed).

(2) Finding the top of each bottle. Once the location of each (complete or nearly com- 
plete) bottle is determined, we again can use the contrast between the bottles and the 
background to find the top of the bottle. One possible approach is to compute a gradi- 
ent image (sensitive only to horizontal edges) and look for a horizontal line near the top 
of the gradient image. An easier method is to run a vertical scan line through the cen- 
ter of the locations found in the previous step. The first major transition in gray level 
(from the top of the image) in the scan line will give a good indication of the location of 
the top of a bottle. 

(3) Finding the neck and shoulder of a bottle. In the absence of other information, we
assume that all bottles are of the same size, as shown in the sample image. Then, once
we know where the top of a bottle is, the location of the neck and shoulder are known to
be at a fixed distance from the bottle top.

(4) Determining the level of the liquid. The area defined by the bottom of the neck and
the top of the shoulder is the only area that needs to be examined to determine acceptable
vs. unacceptable fill level in a given bottle. As shown in the sample image, an
area of a bottle that is void of liquid appears quite bright in an image, so we have various
options. We could run a single vertical scan line again, but note that the bottles have
areas of reflection that could confuse this approach. This computation is at the core
of what this system is designed to do, so a more reliable method should be used. One
approach is to threshold the area spanning a rectangle defined by the bottom of the neck,
the shoulder, and sides of the bottle. Then, we count the number of white pixels above
the midpoint of this rectangle. If this number is greater than a pre-established value, we
know that enough liquid is missing and declare the bottle improperly filled. A slightly
more sophisticated technique would be to actually find the level of the liquid. This
would consist of looking for a horizontal edge in the region within the bottle defined by
the sides of the bottle, the bottom of the neck, and a line passing midway between the
shoulder and the bottom of the neck. A gradient/edge-linking approach, as described in
Sections 10.1 and 10.2, would be suitable. Note, however, that if no edge is found, the
region is either filled (dark values in the region) or completely void of liquid (white, or
near white values in the region). A computation to resolve these two possible conditions
has to follow if the system fails to find an edge.


Problem 11.24 


The key specification of the desired system is that it be able to detect individual bubbles. 
No specific sizes are given. We assume that bubbles are nearly round, as shown in the 
test image. One solution consists of (1) segmenting the image; (2) post-processing the 
result; (3) finding the bubbles and bubble clusters, and determining bubbles that merged 
with the boundary of the image; (4) detecting groups of touching bubbles; (5) counting 
individual bubbles; and (6) determining the ratio of the area occupied by all bubbles to 
the total image area. 

(1) Segmenting the image. We assume that the sample image is truly representative of
the class of images that the system will encounter. The image shown in the problem
statement is typical of images that can be segmented by a global threshold. As shown
by the histogram in Fig. P11.24, the gray level of the objects of interest is high on the
gray scale. A simple adaptive threshold method for data that is that high on the scale is
to choose a threshold equal to the mean plus a multiple of the standard deviation. We
chose a threshold equal to $m + 2\sigma$, which, for the image in the problem statement, was
195. The segmented result is shown on the right of Fig. P11.24. Obviously this is
not the only approach we could take, but this is a simple method that adapts to overall
changes in intensity.
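
A threshold of the form $m + k\sigma$ takes only a few lines to compute. The Python sketch below uses the standard-library `statistics` module; the function names and the ten-pixel sample are assumptions made for the example, and the population standard deviation is assumed because the solution does not say which estimator was used.

```python
import statistics


def adaptive_threshold(pixels, k=2.0):
    """Global threshold equal to the mean plus k standard deviations."""
    return statistics.fmean(pixels) + k * statistics.pstdev(pixels)


def segment(pixels, threshold):
    """Binary segmentation: 1 where the gray level exceeds the threshold."""
    return [1 if p > threshold else 0 for p in pixels]


# Hypothetical gray levels: mostly dark background plus one bright pixel.
sample = [20] * 9 + [220]
t = adaptive_threshold(sample)
print(t)
print(segment(sample, t))
```

Because the threshold is derived from the statistics of each image, a uniform change in overall intensity shifts the mean and the threshold together.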

(2) Post-processing. As shown in the segmented image of Fig. P11.24, many of the
bubbles appear as broken disks, or disks with interior black components. These are
mostly due either to reflection (as in Fig. 9.16) or actual voids within a bubble. We
could attempt to build a procedure to repair and/or fill the bubbles (as in Problem 9.23).
However, this can turn into a computationally expensive process that is not warranted
unless stringent measurement standards are required, a fact not mentioned in the problem
statement. An alternative is to calculate, on the average (as determined from a set of
sample images), the percentage of bubble areas that are filled with black or have black
"bays" which make their black areas merge with the background. Then, once the
dimensions of each bubble (or bubble cluster) have been established, a correction factor
based on area would be applied.

(3) Finding the bubbles. Refer to the solution to Problem 9.27. The solution is based 
on connected components, which also yields all bubbles and bubble clusters. 

(4) In order to detect bubble clusters we make use of shape analysis. For each con- 
nected component, we find the eigen axes (see Section 11.4) and the standard deviation 
of the data along these axes (square root of the eigenvalues of the covariance matrix). 
One simple solution is to compute the ratio of the large to the small variance of each 
connected component along the eigen axes. A single, uniformly-filled, perfectly round 
bubble will have a ratio of 1. Deviations from 1 indicate elongations about one of the
axes. We look for elliptical shapes as being formed by clusters of bubbles. A threshold 
to classify bubbles as single vs. clusters has to be determined experimentally. Note that 
single pixels or pixel streaks one pixel wide have a standard deviation of zero, so they 
must be processed separately. We have the option of considering connected components 
that consist of only one pixel to be either noise, or the smallest detectable bubble. No 
information is given in the problem statement about this. In theory, it is possible for a 
cluster to be formed such that its shape would be symmetrical about both axes, in which 
case the system would classify the cluster as a single bubble. Resolution of conflicts 
such as this would require additional processing. However, there is no evidence in the
sample image to suggest that this in fact is a problem. Bubble clusters tend to appear 
as elliptical shapes. In cases where the ratio of the standard deviations is close to the 
threshold value, we could add additional processing to reduce the chances of making a 
mistake. 
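
The shape test in (4) can be sketched as follows. This is only an illustration under stated assumptions: a connected component is represented as a list of (row, column) pixel coordinates, the function name is hypothetical, and the eigenvalues of the 2 x 2 covariance matrix are obtained in closed form. Degenerate components (single pixels or one-pixel-wide streaks) return None so that they can be processed separately, as noted above.

```python
import math

def variance_ratio(pixels):
    """Ratio of the large to the small variance along the eigen axes.

    pixels: list of (row, col) coordinates of one connected component.
    Returns None when the small variance is zero (single pixel or streak).
    """
    n = len(pixels)
    mr = sum(r for r, _ in pixels) / n
    mc = sum(c for _, c in pixels) / n
    # Entries of the 2x2 covariance matrix of the pixel coordinates.
    crr = sum((r - mr) ** 2 for r, _ in pixels) / n
    ccc = sum((c - mc) ** 2 for _, c in pixels) / n
    crc = sum((r - mr) * (c - mc) for r, c in pixels) / n
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (crr + ccc) / 2
    root = math.sqrt(((crr - ccc) / 2) ** 2 + crc ** 2)
    lam_max, lam_min = half_trace + root, half_trace - root
    if lam_min <= 1e-12:
        return None
    return lam_max / lam_min

square = [(r, c) for r in range(3) for c in range(3)]    # compact blob
rect = [(r, c) for r in range(2) for c in range(6)]      # elongated blob
streak = [(0, c) for c in range(5)]                      # one pixel wide
```

A compact blob gives a ratio near 1, an elongated blob gives a larger ratio, and the threshold separating the two cases would still have to be set experimentally.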
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(5) Counting individual bubbles. A bubble that does not merge with the border of the 
image or is not a cluster, is by definition a single bubble. Thus, counting these bubbles 
is simply counting the connected components that have not been tagged as clusters or 
merged with the boundary of the image. 

(6) Ratio of the areas. This ratio is simply the number of pixels in all the connected 
components plus the correction factors mentioned in (2), divided by the total number of 
pixels in the image. 

The problem also asks for the size of the smallest bubble the system can detect. If, as 
mentioned in (4), we elect to call a one-pixel connected component a bubble, then the 
smallest bubble dimension detectable is the physical size of one pixel. From the problem 
statement, 700 pixels cover 7 cm, so the dimension of one pixel is 0.1 mm.



Figure P11.24 
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Problem 12.1 

(a) By inspection, the mean vectors of the three classes are, approximately,
$\mathbf{m}_1 = (1.5, 0.3)^T$, $\mathbf{m}_2 = (4.3, 1.3)^T$, and $\mathbf{m}_3 = (5.5, 2.1)^T$ for the classes Iris setosa,
versicolor, and virginica, respectively. The decision functions are of the form given in Eq.
(12.2-5). Substituting the above values of mean vectors gives:

$$d_1(\mathbf{x}) = \mathbf{x}^T\mathbf{m}_1 - \tfrac{1}{2}\mathbf{m}_1^T\mathbf{m}_1 = 1.5x_1 + 0.3x_2 - 1.2$$

$$d_2(\mathbf{x}) = \mathbf{x}^T\mathbf{m}_2 - \tfrac{1}{2}\mathbf{m}_2^T\mathbf{m}_2 = 4.3x_1 + 1.3x_2 - 10.1$$

$$d_3(\mathbf{x}) = \mathbf{x}^T\mathbf{m}_3 - \tfrac{1}{2}\mathbf{m}_3^T\mathbf{m}_3 = 5.5x_1 + 2.1x_2 - 17.3$$

(b) The decision boundaries are given by the equations

$$d_{12}(\mathbf{x}) = d_1(\mathbf{x}) - d_2(\mathbf{x}) = -2.8x_1 - 1.0x_2 + 8.9 = 0$$

$$d_{13}(\mathbf{x}) = d_1(\mathbf{x}) - d_3(\mathbf{x}) = -4.0x_1 - 1.8x_2 + 16.1 = 0$$

$$d_{23}(\mathbf{x}) = d_2(\mathbf{x}) - d_3(\mathbf{x}) = -1.2x_1 - 0.8x_2 + 7.2 = 0$$

A plot of these boundaries is shown in Fig. P12.1. 
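
The decision functions of Problem 12.1 can be checked numerically. The sketch below assumes only the three approximate mean vectors quoted in part (a); the function names and the sample patterns are hypothetical.

```python
def decision_function(m):
    """Return (weights, bias) of d(x) = x^T m - 0.5 * m^T m."""
    return list(m), -0.5 * sum(v * v for v in m)

def classify(x, means):
    """Index of the class whose decision function is largest at x."""
    scores = []
    for m in means:
        w, b = decision_function(m)
        scores.append(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return scores.index(max(scores))

# Approximate class means from part (a): setosa, versicolor, virginica.
means = [(1.5, 0.3), (4.3, 1.3), (5.5, 2.1)]
biases = [decision_function(m)[1] for m in means]
```

The unrounded biases are -1.17, -10.09, and -17.33, which round to the constants used in the solution.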



Figure P12.1




Problem 12.2 


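From the definition of the Euclidean distance,

$$D_j(\mathbf{x}) = \|\mathbf{x} - \mathbf{m}_j\| = \left[(\mathbf{x} - \mathbf{m}_j)^T(\mathbf{x} - \mathbf{m}_j)\right]^{1/2}$$

Since $D_j(\mathbf{x})$ is non-negative, choosing the smallest $D_j(\mathbf{x})$ is the same as choosing the
smallest $D_j^2(\mathbf{x})$, where

$$D_j^2(\mathbf{x}) = \|\mathbf{x} - \mathbf{m}_j\|^2 = (\mathbf{x} - \mathbf{m}_j)^T(\mathbf{x} - \mathbf{m}_j)$$

$$= \mathbf{x}^T\mathbf{x} - 2\mathbf{x}^T\mathbf{m}_j + \mathbf{m}_j^T\mathbf{m}_j$$

$$= \mathbf{x}^T\mathbf{x} - 2\left(\mathbf{x}^T\mathbf{m}_j - \tfrac{1}{2}\mathbf{m}_j^T\mathbf{m}_j\right)$$

We note that the term $\mathbf{x}^T\mathbf{x}$ is independent of $j$ (that is, it is a constant with respect to $j$ in
$D_j^2(\mathbf{x})$, $j = 1, 2, \ldots$). Thus, choosing the minimum of $D_j^2(\mathbf{x})$ is equivalent to choosing
the maximum of $\left(\mathbf{x}^T\mathbf{m}_j - \tfrac{1}{2}\mathbf{m}_j^T\mathbf{m}_j\right)$.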


Problem 12.3 


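The equation of the decision boundary between a pair of mean vectors is

$$d_{ij}(\mathbf{x}) = \mathbf{x}^T(\mathbf{m}_i - \mathbf{m}_j) - \tfrac{1}{2}(\mathbf{m}_i^T\mathbf{m}_i - \mathbf{m}_j^T\mathbf{m}_j)$$

The midpoint between $\mathbf{m}_i$ and $\mathbf{m}_j$ is $(\mathbf{m}_i + \mathbf{m}_j)/2$ (see Fig. P12.3). First, we show that
this point is on the boundary by substituting it for $\mathbf{x}$ in the above equation and showing
that the result is equal to 0:

$$\tfrac{1}{2}(\mathbf{m}_i + \mathbf{m}_j)^T(\mathbf{m}_i - \mathbf{m}_j) - \tfrac{1}{2}(\mathbf{m}_i^T\mathbf{m}_i - \mathbf{m}_j^T\mathbf{m}_j)$$

$$= \tfrac{1}{2}(\mathbf{m}_i^T\mathbf{m}_i - \mathbf{m}_j^T\mathbf{m}_j) - \tfrac{1}{2}(\mathbf{m}_i^T\mathbf{m}_i - \mathbf{m}_j^T\mathbf{m}_j)$$

$$= 0$$

Next, we show that the vector $(\mathbf{m}_i - \mathbf{m}_j)$ is perpendicular to the hyperplane boundary.
There are several ways to do this. Perhaps the easiest is to show that $(\mathbf{m}_i - \mathbf{m}_j)$ is in
the same direction as the unit normal to the hyperplane. For a hyperplane with equation
$w_1x_1 + w_2x_2 + \cdots + w_nx_n + w_{n+1} = 0$, the unit normal is

$$\mathbf{u} = \frac{\mathbf{w}_o}{\|\mathbf{w}_o\|}$$

where $\mathbf{w}_o = (w_1, w_2, \ldots, w_n)^T$. Comparing the above equation for $d_{ij}(\mathbf{x})$ with the
general equation of a hyperplane just given, we see that $\mathbf{w}_o = (\mathbf{m}_i - \mathbf{m}_j)$ and
$w_{n+1} = -(\mathbf{m}_i^T\mathbf{m}_i - \mathbf{m}_j^T\mathbf{m}_j)/2$. Thus, the unit normal of our decision boundary is

$$\mathbf{u} = \frac{(\mathbf{m}_i - \mathbf{m}_j)}{\|\mathbf{m}_i - \mathbf{m}_j\|}$$

which is in the same direction as the vector $(\mathbf{m}_i - \mathbf{m}_j)$. This concludes the proof.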



Figure P12.3 


Problem 12.4 


The solution is shown in Fig. P12.4, where the x's are treated as voltages and the Y's
denote admittances. From basic circuit theory, the currents, I's, are the products of the
voltages times the admittances.



Figure P12.4


Problem 12.5


Assume that the mask is of size $J \times K$. For any value of displacement $(s, t)$, we can
express the area of the image under the mask, as well as the mask $w(x, y)$, in vector form
by letting the first row of the subimage under the mask represent the first $K$ elements of
a column vector $\mathbf{a}$, the elements of the next row the next $K$ elements of $\mathbf{a}$, and so on. At the
end of the procedure we subtract the average value of the gray levels in the subimage
from every element of $\mathbf{a}$. The vector $\mathbf{a}$ is of size $(J \times K) \times 1$. A similar approach
yields a vector, $\mathbf{b}$, of the same size, for the mask $w(x, y)$ minus its average. This vector
does not change as $(s, t)$ varies because the coefficients of the mask are fixed. With this
construction in mind, we see that the numerator of Eq. (xx.3-8) is simply the vector
inner product $\mathbf{a}^T\mathbf{b}$. Similarly, the first term in the denominator is the norm squared of
$\mathbf{a}$, denoted $\mathbf{a}^T\mathbf{a} = \|\mathbf{a}\|^2$, while the second term has a similar interpretation for $\mathbf{b}$. The
correlation coefficient then becomes

$$\gamma(s, t) = \frac{\mathbf{a}^T\mathbf{b}}{\left[(\mathbf{a}^T\mathbf{a})(\mathbf{b}^T\mathbf{b})\right]^{1/2}}$$

When $\mathbf{a} = \mathbf{b}$ (a perfect match), $\gamma(s, t) = \|\mathbf{a}\|^2 / (\|\mathbf{a}\|\,\|\mathbf{a}\|) = 1$, which is the maximum
value obtainable by the above expression. Similarly, the minimum value occurs when
$\mathbf{a} = -\mathbf{b}$, in which case $\gamma(s, t) = -1$. Thus, although the vector $\mathbf{a}$ varies in general for
every value of $(s, t)$, the values of $\gamma(s, t)$ are all in the range $[-1, 1]$.
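
The vector form of the correlation coefficient can be illustrated with a short sketch. The function name and the 2 x 2 test arrays are assumptions made here; the subimage and mask are flattened row by row and have their averages removed, as in the construction above. A constant subimage has zero norm after mean removal, so the sketch returns None in that case.

```python
import math

def correlation_coefficient(sub, mask):
    """gamma = a^T b / sqrt((a^T a)(b^T b)) for one displacement (s, t).

    sub, mask: 2-D lists of equal size. Returns None if either
    mean-removed vector has zero norm.
    """
    a = [p for row in sub for p in row]
    b = [p for row in mask for p in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    a = [v - mean_a for v in a]
    b = [v - mean_b for v in b]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else None

mask = [[1, 2], [3, 5]]
inverted = [[10 - v for v in row] for row in mask]   # a = -b after mean removal
other = [[1, 0], [0, 1]]
g_match = correlation_coefficient(mask, mask)
g_inverse = correlation_coefficient(inverted, mask)
g_other = correlation_coefficient(other, mask)
```

The perfect match and the inverted subimage reach the two ends of the range, and any other subimage falls between them.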


Problem 12.6 


The solution to the first part of this problem is based on being able to extract connected 
components (see Chapters 2 and 11) and then determining whether a connected com- 
ponent is convex or not (see Chapter 11). Once all connected components have been 
extracted we perform a convexity check on each and reject the ones that are not convex. 
All that is left after this is to determine if the remaining blobs are complete or
incomplete. To do this, the region consisting of the extreme rows and columns of the image is
declared a region of 1's. Then if the pixel-by-pixel AND of this region with a particular
blob yields at least one result that is a 1, it follows that the actual boundary touches
that blob, and the blob is called incomplete. When only a single pixel in a blob yields
an AND of 1 we have a marginal result in which only one pixel in a blob touches the
boundary. We can arbitrarily declare the blob incomplete or not. From the point of view
of implementation, it is much simpler to have a procedure that calls a blob incomplete
whenever the AND operation yields one or more results valued 1.

After the blobs have been screened using the method just discussed, they need to be
classified into one of the three classes given in the problem statement. We perform the
classification based on vectors of the form $\mathbf{x} = (x_1, x_2)^T$, where $x_1$ and $x_2$ are,
respectively, the lengths of the major and minor axis of an elliptical blob, the only type
left after screening. Alternatively, we could use the eigen axes for the same purpose.
(See Section 11.2.1 on obtaining the major axes or the end of Section 11.4 regarding the
eigen axes.) The mean vector of each class needed to implement a minimum distance
classifier is really given in the problem statement as the average length of each of the two
axes for each class of blob. If they were not given, they could be obtained by measuring
the length of the axes for complete ellipses that have been classified a priori as belonging
to each of the three classes. The given set of ellipses would thus constitute a training set,
and learning would simply consist of computing the principal axes for all ellipses of one
class and then obtaining the average. This would be repeated for each class. A block
diagram outlining the solution to this problem is straightforward.
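
The completeness test described for Problem 12.6 (the AND of a blob with the extreme rows and columns) can be sketched as below. The function name and the binary list-of-lists representation of a blob are assumptions for illustration; a blob is declared incomplete whenever one or more border pixels belong to it.

```python
def is_incomplete(blob):
    """True if the blob has at least one pixel on the image border.

    blob: 2-D list of 0/1 values the size of the image, with 1's marking
    the pixels of a single connected component.
    """
    rows, cols = len(blob), len(blob[0])
    for r in range(rows):
        for c in range(cols):
            # Border region: extreme rows and columns, treated as 1's.
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and blob[r][c] == 1:   # pixel-by-pixel AND
                return True
    return False

interior_blob = [[0, 0, 0, 0],
                 [0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0]]
clipped_blob = [[0, 0, 0, 0],
                [0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 0, 0, 0]]
```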


Problem 12.7 


(a) Since it is given that the pattern classes are governed by Gaussian densities, only
knowledge of the mean vector and covariance matrix of each class is required to specify
the Bayes classifier. Substituting the given patterns into Eqs. (12.2-22) and (12.2-23)
yields

$$\mathbf{m}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad \mathbf{m}_2 = \begin{bmatrix} 5 \\ 5 \end{bmatrix}$$

and

$$\mathbf{C}_1 = \mathbf{C}_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

Since $\mathbf{C}_1 = \mathbf{C}_2 = \mathbf{I}$, the decision functions are the same as for a minimum distance
classifier:

$$d_1(\mathbf{x}) = \mathbf{x}^T\mathbf{m}_1 - \tfrac{1}{2}\mathbf{m}_1^T\mathbf{m}_1 = 1.0x_1 + 1.0x_2 - 1.0$$

and

$$d_2(\mathbf{x}) = \mathbf{x}^T\mathbf{m}_2 - \tfrac{1}{2}\mathbf{m}_2^T\mathbf{m}_2 = 5.0x_1 + 5.0x_2 - 25.0$$

The Bayes decision boundary is given by the equation $d(\mathbf{x}) = d_1(\mathbf{x}) - d_2(\mathbf{x}) = 0$, or

$$d(\mathbf{x}) = -4.0x_1 - 4.0x_2 + 24.0 = 0$$

(b) A plot of the boundary is shown in Fig. P12.7.


Problem 12.8


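(a) As in Problem 12.7,

$$\mathbf{m}_1 = \mathbf{m}_2 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

$$\mathbf{C}_1 = \frac{1}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad \mathbf{C}_2 = 2\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$$\mathbf{C}_1^{-1} = 2\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad \mathbf{C}_2^{-1} = \frac{1}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$$|\mathbf{C}_1| = 0.25 \qquad \text{and} \qquad |\mathbf{C}_2| = 4.00$$

Since the covariance matrices are not equal, it follows from Eq. (12.2-26) that

$$d_1(\mathbf{x}) = -\tfrac{1}{2}\ln(0.25) - \tfrac{1}{2}\,\mathbf{x}^T\begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}\mathbf{x} = -\tfrac{1}{2}\ln(0.25) - (x_1^2 + x_2^2)$$

and

$$d_2(\mathbf{x}) = -\tfrac{1}{2}\ln(4.00) - \tfrac{1}{2}\,\mathbf{x}^T\begin{bmatrix} 0.5 & 0 \\ 0 & 0.5 \end{bmatrix}\mathbf{x} = -\tfrac{1}{2}\ln(4.00) - \tfrac{1}{4}(x_1^2 + x_2^2)$$

where the term $\ln P(\omega_j)$ was not included because it is the same for both decision
functions in this case. The equation of the Bayes decision boundary is

$$d(\mathbf{x}) = d_1(\mathbf{x}) - d_2(\mathbf{x}) = 1.39 - \tfrac{3}{4}(x_1^2 + x_2^2) = 0.$$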

(b) A plot of the boundary is shown in Fig. P12.8. 
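
As a numerical check of Problem 12.8, the sketch below evaluates the two decision functions for zero-mean classes with covariance matrices 0.5 I and 2 I, which are the values the decision functions above imply. The function names are hypothetical, and the prior term is omitted because it is common to both classes. The boundary is a circle centered at the origin whose radius satisfies (3/4) r^2 = ln 4.

```python
import math

def d_isotropic(x, var):
    """Gaussian decision function for a zero-mean class with C = var * I.

    Implements -0.5*ln|C| - 0.5*x^T C^{-1} x (the ln P term is omitted).
    """
    det = var ** len(x)
    dist2 = sum(v * v for v in x)
    return -0.5 * math.log(det) - 0.5 * dist2 / var

def boundary(x):
    """d(x) = d1(x) - d2(x); positive inside the circle, negative outside."""
    return d_isotropic(x, 0.5) - d_isotropic(x, 2.0)

radius = math.sqrt(math.log(4.0) / 0.75)   # about 1.36
```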



Problem 12.9 


The basic mechanics are the same as in Problem 12.6, but we have the additional re- 
quirement of computing covariance matrices from the training patterns of each class. 


Problem 12.10 


From basic probability theory,

$$p(c) = \sum_{\mathbf{x}} p(c/\mathbf{x})\,p(\mathbf{x}).$$

For any pattern belonging to class $\omega_j$, $p(c/\mathbf{x}) = p(\omega_j/\mathbf{x})$. Therefore,

$$p(c) = \sum_{\mathbf{x}} p(\omega_j/\mathbf{x})\,p(\mathbf{x}).$$

Substituting into this equation the formula $p(\omega_j/\mathbf{x}) = p(\mathbf{x}/\omega_j)\,p(\omega_j)/p(\mathbf{x})$ gives

$$p(c) = \sum_{\mathbf{x}} p(\mathbf{x}/\omega_j)\,p(\omega_j).$$

Since the argument of the summation is positive, $p(c)$ is maximized by maximizing
$p(\mathbf{x}/\omega_j)\,p(\omega_j)$ for each $j$. That is, if for each $\mathbf{x}$ we compute $p(\mathbf{x}/\omega_j)\,p(\omega_j)$ for
$j = 1, 2, \ldots, W$, and use the largest value each time as the basis for selecting the class from
which $\mathbf{x}$ came, then $p(c)$ will be maximized. Since $p(e) = 1 - p(c)$, the probability of
error is minimized by this procedure.
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Problem 12.11 

(a) For class $\omega_1$ we let $\mathbf{y}(1) = (0, 0, 0, 1)^T$, $\mathbf{y}(2) = (1, 0, 0, 1)^T$, $\mathbf{y}(3) = (1, 0, 1, 1)^T$,
$\mathbf{y}(4) = (1, 1, 0, 1)^T$. Similarly, for class $\omega_2$, $\mathbf{y}(5) = (0, 0, 1, 1)^T$, $\mathbf{y}(6) = (0, 1, 1, 1)^T$,
$\mathbf{y}(7) = (0, 1, 0, 1)^T$, $\mathbf{y}(8) = (1, 1, 1, 1)^T$. Then, using $c = 1$ and

$\mathbf{w}(1) = (-1, -2, -2, 0)^T$

it follows from Eqs. (12.2-34) through (12.2-36) that:

$\mathbf{w}(1)^T\mathbf{y}(1) = 0$, $\mathbf{w}(2) = \mathbf{w}(1) + \mathbf{y}(1) = (-1, -2, -2, 1)^T$;

$\mathbf{w}(2)^T\mathbf{y}(2) = 0$, $\mathbf{w}(3) = \mathbf{w}(2) + \mathbf{y}(2) = (0, -2, -2, 2)^T$;

$\mathbf{w}(3)^T\mathbf{y}(3) = 0$, $\mathbf{w}(4) = \mathbf{w}(3) + \mathbf{y}(3) = (1, -2, -1, 3)^T$;

$\mathbf{w}(4)^T\mathbf{y}(4) = 2$, $\mathbf{w}(5) = \mathbf{w}(4) = (1, -2, -1, 3)^T$;

$\mathbf{w}(5)^T\mathbf{y}(5) = 2$, $\mathbf{w}(6) = \mathbf{w}(5) - \mathbf{y}(5) = (1, -2, -2, 2)^T$;

$\mathbf{w}(6)^T\mathbf{y}(6) = -2$, $\mathbf{w}(7) = \mathbf{w}(6) = (1, -2, -2, 2)^T$;

$\mathbf{w}(7)^T\mathbf{y}(7) = 0$, $\mathbf{w}(8) = \mathbf{w}(7) - \mathbf{y}(7) = (1, -3, -2, 1)^T$;

$\mathbf{w}(8)^T\mathbf{y}(8) = -3$, $\mathbf{w}(9) = \mathbf{w}(8) = (1, -3, -2, 1)^T$.

Since a complete iteration through all patterns without an error was not achieved, the
patterns are recycled by letting $\mathbf{y}(9) = \mathbf{y}(1)$, $\mathbf{y}(10) = \mathbf{y}(2)$, and so on, which gives

$\mathbf{w}(9)^T\mathbf{y}(9) = 1$, $\mathbf{w}(10) = \mathbf{w}(9) = (1, -3, -2, 1)^T$;

$\mathbf{w}(10)^T\mathbf{y}(10) = 2$, $\mathbf{w}(11) = \mathbf{w}(10) = (1, -3, -2, 1)^T$;

$\mathbf{w}(11)^T\mathbf{y}(11) = 0$, $\mathbf{w}(12) = \mathbf{w}(11) + \mathbf{y}(11) = (2, -3, -1, 2)^T$;

$\mathbf{w}(12)^T\mathbf{y}(12) = 1$, $\mathbf{w}(13) = \mathbf{w}(12) = (2, -3, -1, 2)^T$;

$\mathbf{w}(13)^T\mathbf{y}(13) = 1$, $\mathbf{w}(14) = \mathbf{w}(13) - \mathbf{y}(13) = (2, -3, -2, 1)^T$;

$\mathbf{w}(14)^T\mathbf{y}(14) = -4$, $\mathbf{w}(15) = \mathbf{w}(14) = (2, -3, -2, 1)^T$;

$\mathbf{w}(15)^T\mathbf{y}(15) = -2$, $\mathbf{w}(16) = \mathbf{w}(15) = (2, -3, -2, 1)^T$;

$\mathbf{w}(16)^T\mathbf{y}(16) = -2$, $\mathbf{w}(17) = \mathbf{w}(16) = (2, -3, -2, 1)^T$.

Again, since a complete iteration over all patterns without an error was not achieved, the
patterns are recycled by letting $\mathbf{y}(17) = \mathbf{y}(1)$, $\mathbf{y}(18) = \mathbf{y}(2)$, and so on, which gives:

$\mathbf{w}(17)^T\mathbf{y}(17) = 1$, $\mathbf{w}(18) = \mathbf{w}(17) = (2, -3, -2, 1)^T$;

$\mathbf{w}(18)^T\mathbf{y}(18) = 3$, $\mathbf{w}(19) = \mathbf{w}(18) = (2, -3, -2, 1)^T$;

$\mathbf{w}(19)^T\mathbf{y}(19) = 1$, $\mathbf{w}(20) = \mathbf{w}(19) = (2, -3, -2, 1)^T$;

$\mathbf{w}(20)^T\mathbf{y}(20) = 0$, $\mathbf{w}(21) = \mathbf{w}(20) + \mathbf{y}(20) = (3, -2, -2, 2)^T$;

$\mathbf{w}(21)^T\mathbf{y}(21) = 0$, $\mathbf{w}(22) = \mathbf{w}(21) - \mathbf{y}(21) = (3, -2, -3, 1)^T$.

It is easily verified that no more corrections take place after this step, so
$\mathbf{w}(22) = (3, -2, -3, 1)^T$ is a solution weight vector.

(b) The decision surface is given by the equation

$$\mathbf{w}^T\mathbf{y} = 3y_1 - 2y_2 - 3y_3 + 1 = 0$$

A section of this surface is shown schematically in Fig. P12.11. The positive side of the
surface faces the origin.
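
The iteration in Problem 12.11 can be reproduced with a short sketch of the fixed-increment perceptron rule: add c y when a class 1 pattern gives a non-positive response, and subtract c y when a class 2 pattern gives a non-negative response. The function name and the stopping rule (one complete error-free pass) are written here for illustration; the patterns, the starting vector, and c = 1 are those of the solution.

```python
def train_perceptron(class1, class2, w, c=1, max_passes=100):
    """Fixed-increment perceptron on augmented patterns.

    Returns the weight vector after the first complete pass through all
    patterns in which no correction takes place.
    """
    w = list(w)
    data = [(y, +1) for y in class1] + [(y, -1) for y in class2]
    for _ in range(max_passes):
        corrections = 0
        for y, label in data:
            s = sum(wi * yi for wi, yi in zip(w, y))
            if label == +1 and s <= 0:
                w = [wi + c * yi for wi, yi in zip(w, y)]
                corrections += 1
            elif label == -1 and s >= 0:
                w = [wi - c * yi for wi, yi in zip(w, y)]
                corrections += 1
        if corrections == 0:
            return w
    raise RuntimeError("no error-free pass within max_passes")

class1 = [(0, 0, 0, 1), (1, 0, 0, 1), (1, 0, 1, 1), (1, 1, 0, 1)]
class2 = [(0, 0, 1, 1), (0, 1, 1, 1), (0, 1, 0, 1), (1, 1, 1, 1)]
w_solution = train_perceptron(class1, class2, (-1, -2, -2, 0))
```

Presenting the patterns in the same order as the solution leads to the same weight vector after the third pass, and the fourth pass confirms that no further corrections occur.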



Problem 12.12 


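We start by taking the partial derivative of $J$ with respect to $\mathbf{w}$:

$$\frac{\partial J}{\partial \mathbf{w}} = \frac{1}{2}\left[\mathbf{y}\,\mathrm{sgn}(\mathbf{w}^T\mathbf{y}) - \mathbf{y}\right]$$

where, by definition, $\mathrm{sgn}(\mathbf{w}^T\mathbf{y}) = 1$ if $\mathbf{w}^T\mathbf{y} > 0$, and $\mathrm{sgn}(\mathbf{w}^T\mathbf{y}) = -1$ otherwise.
Substituting the partial derivative into the general expression given in the problem
statement gives

$$\mathbf{w}(k+1) = \mathbf{w}(k) + \frac{c}{2}\left\{\mathbf{y}(k) - \mathbf{y}(k)\,\mathrm{sgn}\left[\mathbf{w}(k)^T\mathbf{y}(k)\right]\right\}$$

where $\mathbf{y}(k)$ is the training pattern being considered at the $k$th iterative step. Substituting
the definition of the sgn function into this result yields

$$\mathbf{w}(k+1) = \mathbf{w}(k) + c\begin{cases} \mathbf{0} & \text{if } \mathbf{w}(k)^T\mathbf{y}(k) > 0 \\ \mathbf{y}(k) & \text{otherwise} \end{cases}$$

where $c > 0$ and $\mathbf{w}(1)$ is arbitrary. This expression agrees with the formulation given in
the problem statement.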


Problem 12.13 


Let the training set of patterns be denoted by $\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_N$. It is assumed that the
training patterns of class $\omega_2$ have been multiplied by $-1$. If the classes are linearly
separable, we want to prove that the perceptron training algorithm yields a solution
weight vector, $\mathbf{w}^*$, with the property

$$\mathbf{w}^{*T}\mathbf{y}_i > T_0$$

where $T_0$ is a nonnegative threshold. With this notation, the perceptron algorithm (with
$c = 1$) is expressed as $\mathbf{w}(k+1) = \mathbf{w}(k)$ if $\mathbf{w}^T(k)\mathbf{y}_i(k) > T_0$, or
$\mathbf{w}(k+1) = \mathbf{w}(k) + \mathbf{y}_i(k)$ otherwise.

Suppose that we retain only the values of $k$ for which a correction takes place (these are
really the only indices of interest). Then, re-adapting the index notation, we may write

$$\mathbf{w}(k+1) = \mathbf{w}(k) + \mathbf{y}_i(k)$$

and

$$\mathbf{w}^T(k)\mathbf{y}_i(k) \le T_0$$

With these simplifications in mind, the proof of convergence is as follows: From the
above equation,

$$\mathbf{w}(k+1) = \mathbf{w}(1) + \mathbf{y}_i(1) + \mathbf{y}_i(2) + \cdots + \mathbf{y}_i(k)$$

Taking the inner product of the solution weight vector with both sides of this equation
gives

$$\mathbf{w}^T(k+1)\mathbf{w}^* = \mathbf{w}^T(1)\mathbf{w}^* + \mathbf{y}_i^T(1)\mathbf{w}^* + \mathbf{y}_i^T(2)\mathbf{w}^* + \cdots + \mathbf{y}_i^T(k)\mathbf{w}^*$$

Each term $\mathbf{y}_i^T(j)\mathbf{w}^*$, $j = 1, 2, \ldots, k$, is greater than $T_0$ (because $\mathbf{w}^*$ is a solution weight
vector), so

$$\mathbf{w}^T(k+1)\mathbf{w}^* \ge \mathbf{w}^T(1)\mathbf{w}^* + kT_0$$

Using the Cauchy-Schwarz inequality, $\|\mathbf{a}\|^2\|\mathbf{b}\|^2 \ge (\mathbf{a}^T\mathbf{b})^2$, results in

$$\left[\mathbf{w}^T(k+1)\mathbf{w}^*\right]^2 \le \|\mathbf{w}(k+1)\|^2\,\|\mathbf{w}^*\|^2$$

or

$$\|\mathbf{w}(k+1)\|^2 \ge \frac{\left[\mathbf{w}^T(k+1)\mathbf{w}^*\right]^2}{\|\mathbf{w}^*\|^2}$$

Another line of reasoning leads to a contradiction regarding $\|\mathbf{w}(k+1)\|^2$. From
above,

$$\|\mathbf{w}(j+1)\|^2 = \|\mathbf{w}(j)\|^2 + 2\mathbf{w}^T(j)\mathbf{y}_i(j) + \|\mathbf{y}_i(j)\|^2$$

or

$$\|\mathbf{w}(j+1)\|^2 - \|\mathbf{w}(j)\|^2 = 2\mathbf{w}^T(j)\mathbf{y}_i(j) + \|\mathbf{y}_i(j)\|^2$$

Let $Q = \max_i \|\mathbf{y}_i(j)\|^2$. Then, since $\mathbf{w}^T(j)\mathbf{y}_i(j) \le T_0$,

$$\|\mathbf{w}(j+1)\|^2 - \|\mathbf{w}(j)\|^2 \le 2T_0 + Q$$

Adding these inequalities for $j = 1, 2, \ldots, k$ yields

$$\|\mathbf{w}(k+1)\|^2 \le \|\mathbf{w}(1)\|^2 + [2T_0 + Q]\,k$$

This inequality establishes a bound on $\|\mathbf{w}(k+1)\|^2$ that conflicts for sufficiently large
$k$ with the bound established by our earlier inequality. In fact, $k$ can be no larger than
$k_m$, which is a solution to the equation

$$\frac{\left[\mathbf{w}^T(1)\mathbf{w}^* + k_m T_0\right]^2}{\|\mathbf{w}^*\|^2} = \|\mathbf{w}(1)\|^2 + [2T_0 + Q]\,k_m$$

This equation says that $k_m$ is finite, thus proving that the perceptron training algorithm
converges in a finite number of steps to a solution weight vector $\mathbf{w}^*$ if the patterns of the
training set are linearly separable.

Note: The special case with $T_0 = 0$ is proved in a slightly different manner. Under this
condition we have

$$\mathbf{w}^T(k+1)\mathbf{w}^* \ge \mathbf{w}^T(1)\mathbf{w}^* + ka$$

where

$$a = \min_i\left[\mathbf{y}_i^T(j)\mathbf{w}^*\right]$$

Since, by hypothesis, $\mathbf{w}^*$ is a solution weight vector, we know that $\left[\mathbf{y}_i^T(j)\mathbf{w}^*\right] > 0$.
Also, since $\mathbf{w}^T(j)\mathbf{y}_i(j) \le (T_0 = 0)$,

$$\|\mathbf{w}(j+1)\|^2 - \|\mathbf{w}(j)\|^2 \le \|\mathbf{y}_i(j)\|^2 \le Q.$$

The rest of the proof remains the same. The bound on the number of steps is the value
of $k_m$ that satisfies the following equation:

$$\frac{\left[\mathbf{w}^T(1)\mathbf{w}^* + k_m a\right]^2}{\|\mathbf{w}^*\|^2} = \|\mathbf{w}(1)\|^2 + Qk_m$$

Problem 12.14 


The single decision function that implements a minimum distance classifier for two
classes is of the form

$$d_{ij}(\mathbf{x}) = \mathbf{x}^T(\mathbf{m}_i - \mathbf{m}_j) - \tfrac{1}{2}(\mathbf{m}_i^T\mathbf{m}_i - \mathbf{m}_j^T\mathbf{m}_j).$$

Thus, for a particular pattern vector $\mathbf{x}$, when $d_{ij}(\mathbf{x}) > 0$, $\mathbf{x}$ is assigned to class $\omega_i$ and,
when $d_{ij}(\mathbf{x}) < 0$, $\mathbf{x}$ is assigned to class $\omega_j$. Values of $\mathbf{x}$ for which $d_{ij}(\mathbf{x}) = 0$ are on
the boundary (hyperplane) separating the two classes. By letting $\mathbf{w} = (\mathbf{m}_i - \mathbf{m}_j)$ and
$w_{n+1} = -\tfrac{1}{2}(\mathbf{m}_i^T\mathbf{m}_i - \mathbf{m}_j^T\mathbf{m}_j)$, we can express the above decision function in the form

$$d(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + w_{n+1}.$$

This is recognized as a linear decision function in $n$ dimensions, which is implemented
by a single layer neural network with coefficients

$$w_k = (m_{ik} - m_{jk}), \qquad k = 1, 2, \ldots, n$$

and

$$w_{n+1} = -\tfrac{1}{2}(\mathbf{m}_i^T\mathbf{m}_i - \mathbf{m}_j^T\mathbf{m}_j)$$


Problem 12.15



The approach to solving this problem is basically the same as in Problem 12.14. The
idea is to combine the decision functions in the form of a hyperplane and then equate
coefficients. For equal covariance matrices, the decision function for two pattern classes
is obtained from Eq. (12.2-27):

$$d_{ij}(\mathbf{x}) = d_i(\mathbf{x}) - d_j(\mathbf{x}) = \ln P(\omega_i) - \ln P(\omega_j) + \mathbf{x}^T\mathbf{C}^{-1}(\mathbf{m}_i - \mathbf{m}_j) - \tfrac{1}{2}(\mathbf{m}_i - \mathbf{m}_j)^T\mathbf{C}^{-1}(\mathbf{m}_i + \mathbf{m}_j).$$

As in Problem 12.14, this is recognized as a linear decision function of the form

$$d(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + w_{n+1}$$

which is implemented by a single layer perceptron with coefficients

$$w_k = v_k, \qquad k = 1, 2, \ldots, n$$

and

$$\theta = w_{n+1} = \ln P(\omega_i) - \ln P(\omega_j) - \tfrac{1}{2}(\mathbf{m}_i - \mathbf{m}_j)^T\mathbf{C}^{-1}(\mathbf{m}_i + \mathbf{m}_j)$$

where the $v_k$ are elements of the vector

$$\mathbf{v} = \mathbf{C}^{-1}(\mathbf{m}_i - \mathbf{m}_j).$$


Problem 12.16 


(a) When $P(\omega_i) = P(\omega_j)$ and $\mathbf{C} = \mathbf{I}$.

(b) No. The minimum distance classifier implements a decision function that is the 
perpendicular bisector of the line joining the two means. If the probability densities are 
known, the Bayes classifier is guaranteed to implement an optimum decision function 
in the minimum average loss sense. The generalized delta rule for training a neural 
network says nothing about these two criteria, so it cannot be expected to yield the 
decision functions in Problems 12.14 or 12.15. 


Problem 12.17 


The classes and boundary needed to separate them are shown in Fig. P12.17(a). The
boundary of minimum complexity in this case is a triangle, but it would be so tight
in this arrangement that even small perturbations in the position of the patterns could
result in classification errors. Thus, we use a network with the capability to implement
4 surfaces (lines) in 2D. The network, shown in Fig. P12.17(b), is an extension of the
concepts discussed in the text in connection with Fig. 12.22. In this case, the output
node acts like an AND gate with 4 inputs. The output node outputs a 1 (high) when
the outputs of the preceding 4 nodes are all high simultaneously. This corresponds to a
pattern being on the + side of all 4 lines and, therefore, belonging to class $\omega_1$. Any other
combination yields a 0 (low) output, indicating class $\omega_2$.


Figure P12.17




Problem 12.18 


All that is needed is to generate for each class training vectors of the form $\mathbf{x} = (x_1, x_2)^T$,
where $x_1$ is the length of the major axis and $x_2$ is the length of the minor axis of the blobs
comprising the training set. These vectors would then be used to train a neural network 
using, for example, the generalized delta rule. (Since the patterns are in 2D, it is useful 
to point out to students that the neural network could be designed by inspection in the 
sense that the classes could be plotted, the decision boundary of minimum complexity 
obtained, and then its coefficients used to specify the neural network. In this case the 
classes are far apart with respect to their spread, so most likely a single layer network 
implementing a linear decision function could do the job.) 


Problem 12.19 


This problem, although it is a simple exercise in differentiation, is intended to help the
student fix in mind the notation used in the derivation of the generalized delta rule. From
Eq. (12.2-50), with $\theta_o = 1$,

$$h_j(I_j) = \frac{1}{1 + e^{-\left[\sum_{k=1}^{N_k} w_{jk}O_k + \theta_j\right]}}$$

Since, from Eq. (12.2-48),

$$I_j = \sum_{k=1}^{N_k} w_{jk}O_k$$

it follows that

$$h_j(I_j) = \frac{1}{1 + e^{-[I_j + \theta_j]}}$$

Taking the partial derivative of this expression with respect to $I_j$ gives

$$h'_j(I_j) = \frac{\partial h_j(I_j)}{\partial I_j} = \frac{e^{-[I_j + \theta_j]}}{\left[1 + e^{-[I_j + \theta_j]}\right]^2}.$$

From Eq. (12.2-49)

$$O_j = h_j(I_j) = \frac{1}{1 + e^{-[I_j + \theta_j]}}.$$

It is easily shown that

$$O_j(1 - O_j) = \frac{e^{-[I_j + \theta_j]}}{\left[1 + e^{-[I_j + \theta_j]}\right]^2}$$

so

$$h'_j(I_j) = O_j(1 - O_j)$$

This completes the proof.
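
The identity just derived can also be checked numerically. The sketch below compares O(1 - O) with a central-difference estimate of the derivative of the sigmoid (with the steepness parameter set to 1, as in the solution); the function names, the step size, and the sample values of I and theta are assumptions made here.

```python
import math

def h(I, theta=0.0):
    """Sigmoid activation 1 / (1 + exp(-(I + theta)))."""
    return 1.0 / (1.0 + math.exp(-(I + theta)))

def h_prime(I, theta=0.0):
    """Derivative written in terms of the output: O * (1 - O)."""
    O = h(I, theta)
    return O * (1.0 - O)

def numerical_derivative(I, theta=0.0, eps=1e-6):
    """Central-difference estimate of dh/dI."""
    return (h(I + eps, theta) - h(I - eps, theta)) / (2.0 * eps)

samples = [(-2.0, 0.5), (0.0, 0.0), (0.3, -0.1), (1.7, 0.2)]
errors = [abs(h_prime(I, t) - numerical_derivative(I, t)) for I, t in samples]
```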


Problem 12.20 


The first part of Eq. (12.3-3) is proved by noting that the degree of similarity, $k$, is
non-negative, so $D(A, B) = 1/k \ge 0$. Similarly, the second part follows from the fact that
$k$ is infinite when (and only when) the shapes are identical.

To prove the third part we use the definition of $D$ to write

$$D(A, C) \le \max\left[D(A, B), D(B, C)\right]$$

as

$$\frac{1}{k_{ac}} \le \max\left[\frac{1}{k_{ab}}, \frac{1}{k_{bc}}\right]$$

or, equivalently,

$$k_{ac} \ge \min\left[k_{ab}, k_{bc}\right]$$

where $k_{ij}$ is the degree of similarity between shape $i$ and shape $j$. Recall from the
definition that $k$ is the largest order for which the shape numbers of shape $i$ and shape $j$
still coincide. As Fig. 12.24(b) illustrates, this is the point at which the figures "separate"
as we move further down the tree (note that $k$ increases as we move further down
the tree). We prove that $k_{ac} \ge \min[k_{ab}, k_{bc}]$ by contradiction. For $k_{ac} < \min[k_{ab}, k_{bc}]$
to hold, shape $A$ has to separate from shape $C$ before (1) shape $A$ separates from shape
$B$, and (2) before shape $B$ separates from shape $C$, otherwise $k_{ab} \le k_{ac}$ or $k_{bc} \le k_{ac}$,
which automatically violates the condition $k_{ac} < \min[k_{ab}, k_{bc}]$. But, if (1) has to hold,
then Fig. P12.20 shows the only way that $A$ can separate from $C$ before separating from
$B$. This, however, violates (2), which means that the condition $k_{ac} < \min[k_{ab}, k_{bc}]$
is violated (we can also see this in the figure by noting that $k_{ac} = k_{bc}$ which, since
$k_{bc} < k_{ab}$, violates the condition). We use a similar argument to show that if (2)
holds then (1) is violated. Thus, we conclude that it is impossible for the condition
$k_{ac} < \min[k_{ab}, k_{bc}]$ to hold, thus proving that $k_{ac} \ge \min[k_{ab}, k_{bc}]$ or, equivalently, that
$D(A, C) \le \max[D(A, B), D(B, C)]$.



Figure P12.20


Problem 12.21 


$Q = 0$ implies that $\max(|A|, |B|) = M$. Suppose that $|A| > |B|$. Then, it must follow
that $|A| = M$ and, therefore, that $M > |B|$. But $M$ is obtained by matching $A$ and $B$,
so it must be bounded by $M \le \min(|A|, |B|)$. Since we have stipulated that $|A| > |B|$,
the condition $M \le \min(|A|, |B|)$ implies $M \le |B|$. But this contradicts the above
result, so the only way for $\max(|A|, |B|) = M$ to hold is if $|A| = |B|$. This, in turn,
implies that $A$ and $B$ must be identical strings ($A = B$) because $|A| = |B| = M$ means
that all symbols of $A$ and $B$ match. The converse result that if $A = B$ then $Q = 0$
follows directly from the definition of $Q$.


Problem 12.22 


(a) An automaton capable of accepting only strings of the form $ab^na$, $n \ge 1$, shown in Fig.
P12.22, is given by

$$A_f = (Q, \Sigma, \delta, q_0, F),$$

with

$$Q = \{q_0, q_1, q_2, q_3, q_\emptyset\},$$

$$\Sigma = \{a, b\},$$

mappings

$$\delta(q_0, a) = \{q_1\},$$

$$\delta(q_1, b) = \{q_1, q_2\},$$

$$\delta(q_2, a) = \{q_3\}$$

and

$$F = \{q_3\}.$$

For completeness we write

$$\delta(q_0, b) = \delta(q_1, a) = \delta(q_2, b) = \delta(q_3, a) = \delta(q_3, b) = \delta(q_\emptyset, a) = \delta(q_\emptyset, b) = \{q_\emptyset\},$$

corresponding to the null state.

(b) To obtain the corresponding grammar we use the procedure discussed in Section
12.3.3 under the heading Automata as string recognizers: 1. If $q_j$ is in $\delta(q_i, c)$, there
is a production $X_i \to cX_j$ in $P$; 2. If a state in $F$ is in $\delta(q_i, c)$, there is a production
$X_i \to c$ in $P$. Normally, null state transitions are not included in the generation of
productions. Using the results in (a) we obtain the grammar $G = (N, \Sigma, P, X_0)$, with
$N = \{X_0, X_1, X_2\}$, $\Sigma = \{a, b\}$, and productions $P = \{X_0 \to aX_1$, $X_1 \to bX_1$,
$X_1 \to bX_2$, $X_2 \to a\}$.
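
The automaton of part (a) can be simulated directly. In the sketch below the transition table lists only the three mappings that do not lead to the null state, so a missing entry plays the role of the null state; the dictionary layout and the function name are choices made here for illustration.

```python
# Non-null mappings of the automaton in part (a).
DELTA = {
    ("q0", "a"): {"q1"},
    ("q1", "b"): {"q1", "q2"},
    ("q2", "a"): {"q3"},
}
FINAL = {"q3"}

def accepts(string):
    """Nondeterministic simulation: track the set of reachable states."""
    states = {"q0"}
    for symbol in string:
        next_states = set()
        for q in states:
            # A missing entry corresponds to a transition to the null state.
            next_states |= DELTA.get((q, symbol), set())
        states = next_states
    return bool(states & FINAL)

accepted = [s for s in ["aba", "abba", "abbbba", "aa", "ab", "abab", ""] if accepts(s)]
```

Only the strings consisting of one a, one or more b's, and a final a reach the final state.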



Problem 12.23 


The patterns are of the form shown in the solution to Problem 11.2. (This problem is
not starred, so a solution is not included in the book web site. If the problem was not
assigned, it might be a good idea to give the solution in class.) A possible expansive
tree grammar is $G = (N, \Sigma, P, r, S)$, with $N = \{S, X_1, X_2, \ldots, X_6\}$, $\Sigma = \{0, 1\}$,
$r(0) = \{0, 1, 2\}$, $r(1) = \{0, 1, 2\}$, and the productions shown in Fig. P12.23.


Figure P12.23 (tree productions of the grammar)


Problem 12.24



For the sample set $R^+ = \{aba, abba, abbba\}$ it is easily shown that, for $k = 1$ and 2,
$h(\lambda, R^+, k) = \emptyset$, the null set. Since $q_0 = h(\lambda, R^+, k)$ is part of the inference procedure,
we need to choose $k$ large enough so that $h(\lambda, R^+, k)$ is not the null set. The shortest
string in $R^+$ has three symbols, so $k = 3$ is the smallest value that can accomplish
this. For this value of $k$, a trial run will show that one more string needs to be added
to $R^+$ in order for the inference procedure to discover iterative regularity in symbol $b$.
The sample string set then becomes $R^+ = \{aba, abba, abbba, abbbba\}$. Recalling that
$h(z, R^+, k) = \{w \mid zw \text{ in } R^+, |w| \le k\}$ we proceed as follows:

$z = \lambda$, $h(\lambda, R^+, 3) = \{w \mid \lambda w \text{ in } R^+, |w| \le 3\} = \{aba\} = q_0$;

$z = a$, $h(a, R^+, 3) = \{w \mid aw \text{ in } R^+, |w| \le 3\} = \{ba, bba\} = q_1$;

$z = ab$, $h(ab, R^+, 3) = \{w \mid abw \text{ in } R^+, |w| \le 3\} = \{a, ba, bba\} = q_2$;

$z = aba$, $h(aba, R^+, 3) = \{w \mid abaw \text{ in } R^+, |w| \le 3\} = \{\lambda\} = q_3$;

$z = abb$, $h(abb, R^+, 3) = \{w \mid abbw \text{ in } R^+, |w| \le 3\} = \{a, ba, bba\} = q_2$;

$z = abba$, $h(abba, R^+, 3) = \{w \mid abbaw \text{ in } R^+, |w| \le 3\} = \{\lambda\} = q_3$;

$z = abbb$, $h(abbb, R^+, 3) = \{w \mid abbbw \text{ in } R^+, |w| \le 3\} = \{a, ba\} = q_4$;

$z = abbba$, $h(abbba, R^+, 3) = \{w \mid abbbaw \text{ in } R^+, |w| \le 3\} = \{\lambda\} = q_3$;

$z = abbbb$, $h(abbbb, R^+, 3) = \{w \mid abbbbw \text{ in } R^+, |w| \le 3\} = \{a\} = q_5$;

$z = abbbba$, $h(abbbba, R^+, 3) = \{w \mid abbbbaw \text{ in } R^+, |w| \le 3\} = \{\lambda\} = q_3$.

Other strings $z$ in $\Sigma^* = (a, b)^*$ yield strings $zw$ that do not belong to $R^+$, giving rise
to another state, denoted $q_\emptyset$, which corresponds to the condition that $h$ is the null set.
Therefore, the states are $q_0 = \{aba\}$, $q_1 = \{ba, bba\}$, $q_2 = \{a, ba, bba\}$, $q_3 = \{\lambda\}$,
$q_4 = \{a, ba\}$, and $q_5 = \{a\}$, which gives the set $Q = \{q_0, q_1, q_2, q_3, q_4, q_5, q_\emptyset\}$.

The next step is to obtain the mappings. We start by recalling that, in general,
$q_0 = h(\lambda, R^+, k)$. Also, in general,

$$\delta(q, c) = \{q' \text{ in } Q \mid q' = h(zc, R^+, k), \text{ with } q = h(z, R^+, k)\}.$$

In our case, $q_0 = h(\lambda, R^+, 3)$ and, therefore,

$$\delta(q_0, a) = h(\lambda a, R^+, 3) = h(a, R^+, 3) = \{q_1\} = q_1$$

and

$$\delta(q_0, b) = h(\lambda b, R^+, 3) = h(b, R^+, 3) = \{q_\emptyset\} = q_\emptyset,$$

where we have omitted the curly brackets for clarity in notation since the set contains
only one element. Similarly, $q_1 = h(a, R^+, 3)$, and

$$\delta(q_1, a) = h(aa, R^+, 3) = q_\emptyset,$$

$$\delta(q_1, b) = h(ab, R^+, 3) = q_2.$$

Continuing in this manner gives $q_2 = h(ab, R^+, 3) = h(abb, R^+, 3)$,

$$\delta(q_2, a) = h(aba, R^+, 3) = h(abba, R^+, 3) = q_3,$$

$$\delta(q_2, b) = h(abb, R^+, 3) = q_2,$$

and, also,

$$\delta(q_2, b) = h(abbb, R^+, 3) = q_4.$$

Next, $q_3 = h(aba, R^+, 3) = h(abba, R^+, 3) = h(abbba, R^+, 3) = h(abbbba, R^+, 3)$,
from which we obtain

$$\delta(q_3, a) = h(abaa, R^+, 3) = h(abbaa, R^+, 3) = h(abbbaa, R^+, 3) = h(abbbbaa, R^+, 3) = q_\emptyset$$

$$\delta(q_3, b) = h(abab, R^+, 3) = h(abbab, R^+, 3) = h(abbbab, R^+, 3) = h(abbbbab, R^+, 3) = q_\emptyset.$$

For the following state, $q_4 = h(abbb, R^+, 3)$,

$$\delta(q_4, a) = h(abbba, R^+, 3) = q_3,$$

$$\delta(q_4, b) = h(abbbb, R^+, 3) = q_5.$$

Finally, for the last state, $q_5 = h(abbbb, R^+, 3)$, and

$$\delta(q_5, a) = h(abbbba, R^+, 3) = q_3,$$

$$\delta(q_5, b) = h(abbbbb, R^+, 3) = q_\emptyset.$$

We complete the elements of the automaton by recalling that
$F = \{q \mid q \text{ in } Q, \lambda \text{ in } q\} = q_3$. We also include two remaining mappings that yield the
null set: $\delta(q_\emptyset, a) = \delta(q_\emptyset, b) = q_\emptyset$.

Summarizing, the state mappings are:

$\delta(q_0, a) = q_1$, $\delta(q_0, b) = q_\emptyset$;

$\delta(q_1, a) = q_\emptyset$, $\delta(q_1, b) = q_2$;

$\delta(q_2, a) = q_3$, $\delta(q_2, b) = \{q_2, q_4\}$;

$\delta(q_3, a) = q_\emptyset$, $\delta(q_3, b) = q_\emptyset$;

$\delta(q_4, a) = q_3$, $\delta(q_4, b) = q_5$;

$\delta(q_5, a) = q_3$, $\delta(q_5, b) = q_\emptyset$;

$\delta(q_\emptyset, a) = q_\emptyset$, $\delta(q_\emptyset, b) = q_\emptyset$.

A diagram of the automaton is shown in Fig. P12.24. The iterative regularity on $b$ is
evident in state $q_2$. This automaton is not as elegant as its counterpart in Problem 12.22(a).
This is not unexpected because nothing in the inference procedure deals with state
minimization. Note, however, that the automaton accepts only strings of the form $ab^na$,
$n \ge 1$, as desired. The minimization aspects of a design generally follow inference and
are based on one of several standard methods (see, for example, Gonzalez and Thomason
[1978]). In this particular example, even visual inspection reveals that states $q_4$ and
$q_5$ are redundant.
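
The inference steps of Problem 12.24 lend themselves to a short sketch. The code below computes h(z, R+, k) for every prefix z of the sample strings and builds the mappings from the general definition of delta. The function names are hypothetical, the empty Python string stands for the empty string lambda, and an empty frozenset stands for the null state.

```python
def h(z, sample, k):
    """h(z, R+, k) = {w : zw in R+ and |w| <= k}, as a frozenset of strings."""
    return frozenset(s[len(z):] for s in sample
                     if s.startswith(z) and len(s) - len(z) <= k)

def infer_automaton(sample, k, alphabet):
    """Build states, mappings, and final states from the prefixes of R+."""
    prefixes = {s[:i] for s in sample for i in range(len(s) + 1)}
    q0 = h("", sample, k)
    states = {h(z, sample, k) for z in prefixes}
    delta = {}
    for z in prefixes:
        q = h(z, sample, k)
        for c in alphabet:
            # delta(q, c) collects h(zc) over every z that maps to state q.
            delta.setdefault((q, c), set()).add(h(z + c, sample, k))
    finals = {q for q in states if "" in q}
    return q0, states, delta, finals

sample = ["aba", "abba", "abbba", "abbbba"]
q0, states, delta, finals = infer_automaton(sample, 3, "ab")
q2 = frozenset({"a", "ba", "bba"})
q4 = frozenset({"a", "ba"})
q5 = frozenset({"a"})
null_state = frozenset()
```

With this sample set and k = 3 the sketch yields six non-null states, the two-element mapping for state q2 on symbol b, and a single final state containing the empty string, in agreement with the summary of mappings.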



Problem 12.25 


Consider the automaton related to Fig. 12.30, and the tree shown in Fig. 12.31(b). The
explanation is simplified by moving up the tree one level at a time, starting at the lowest
level. In this case the lowest level is in the innermost branch labeled with a's. We start at
its frontier node and assign state $X_1$ to that node by virtue of $f_a$. The next level contains
an a along that same branch, but its offspring now has been labeled $X_1$. Assignment
$f_a$ again indicates an assignment of $X_1$. We move up the tree in this manner. The
assignments along all the single branches of a's are $X_1$'s, while those along the single
branches of b's are $X_2$'s. This continues until the automaton gets to the bottom of the
single branch of a's at the center of the tree. This particular a now has three offspring
labeled $X_1$ and three labeled $X_2$, which causes $f_a$ to assign state $S$ to that a. As the
automaton moves up one more level, it encounters another a. Since its offspring is $S$,
$f_a$ assigns state $S$ to it and moves up another level. It is evident that the automaton will
end in state $S$ when the last (root) node is processed. Since $S$ is in $F$, the automaton in
fact has accepted the tree in Fig. 12.31(b).


Problem 12.26 


There are various possible approaches to this problem, and our students have shown over
the years a tendency to surprise us with novel approaches to problems of this
type. We give here a set of guidelines that should be satisfied by most practical solutions,
and also offer suggestions for specific solutions to various parts of the problem.
Depending on the level of maturity of the class, some of these may be offered as "hints"
when the problem is assigned.

Since speed and cost are essential system specifications, we conceptualize a binary ap- 
proach in which image acquisition, preprocessing, and segmentation are combined into 
one basic operation. This approach leads us to global thresholding as the method of 
choice. In this particular case this is possible because we can solve the inspection prob- 
lem by concentrating on the white parts of the flag (stars and white stripes). As discussed 
in Section 10.3.2, uniform illumination is essential, especially when global thresholding 
is used for segmentation. The student should mention uniform illumination, or
compensation for nonuniform illumination. A discussion by the student of
color filtering to improve contrast between white and (red/blue/background) parts of an 
image is a plus in the design. 

The first step is to specify the size of the viewing area, and the resolution required to 
detect the smallest components of interest, in this case the stars. Since the images are 
moving and the exact location of each flag is not known, it is necessary to specify a field 
of view that will guarantee that every image will contain at least one complete flag. In 
addition, the frame rate must be fast enough so that no flags are missed. The first part of 
the problem is easy to solve. The field of view has to be wide enough to encompass an
area slightly greater across than two flags plus the maximum separation between them.
Thus, the width, $W$, of the viewing area must be at least $W = 2(5) + 2.05 = 12.05 \approx 12.1$ in. If
we use a standard CCD camera of resolution 640 × 480 elements and view an area 12.8
in. wide, this will give us a sampling rate of 50 pixels/inch, or 250 pixels
across a single flag. Visual inspection of a typical flag will show that the blue portion of
a flag occupies about 0.4 times the length of the flag, which in this case gives us about
100 pixels per line in the blue area. There is a maximum of six stars per line, and the
blue space between them is approximately 1.5 times the width of a star, so the number
of pixels across a star is $100/([1 + 1.5] \times 6) \approx 6.7$, which we round down to 6 pixels/star.
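The resolution estimate can be reproduced in a few lines. This is only a sketch of the arithmetic in the preceding paragraph; the constant and function names are illustrative, and the numeric values are the ones quoted in the text.

```python
FLAG_WIDTH_IN = 5.0       # flag width used in the text, inches
MAX_GAP_IN = 2.05         # maximum separation between flags used in the text, inches
VIEW_WIDTH_IN = 12.8      # chosen width of the viewing area, inches
SENSOR_COLUMNS = 640      # horizontal resolution of the assumed CCD camera
BLUE_FRACTION = 0.4       # blue field as a fraction of flag length
STARS_PER_LINE = 6        # maximum number of stars on one line
GAP_TO_STAR_RATIO = 1.5   # blue space between stars relative to star width


def resolution_budget():
    """Return the spatial quantities derived in the text as a dictionary."""
    min_view_width_in = 2 * FLAG_WIDTH_IN + MAX_GAP_IN
    pixels_per_inch = SENSOR_COLUMNS / VIEW_WIDTH_IN
    flag_px = pixels_per_inch * FLAG_WIDTH_IN
    blue_px = BLUE_FRACTION * flag_px
    star_px = blue_px / ((1 + GAP_TO_STAR_RATIO) * STARS_PER_LINE)
    return {
        "min_view_width_in": min_view_width_in,
        "pixels_per_inch": pixels_per_inch,
        "flag_px": flag_px,
        "blue_px": blue_px,
        "star_px": star_px,
    }
```

Keeping the inputs as named constants makes it easy to see how a different camera resolution or field of view changes the number of pixels available per star.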

The next two problems are to determine the shutter speed and the frame rate. Since 
the number of pixels across each object of interest is only 6, we fix the blur at less 
than one pixel. Following the approach used in the solution of Problem 10.35, we first 
determine the distance between pixels as (12.8 in)/(640 pixels) = 0.02 in/pixel. The
maximum speed of the flags is 21 in/sec. At this speed, the flags travel 21/0.02 = 1,050
pixels/sec. We are requiring that a flag not travel more than one pixel during exposure;
that is, (1,050 pixels/sec) × $T$ sec < 1 pixel. So, $T < 9.52 \times 10^{-4}$ sec is the shutter
speed needed.

The frame rate must be fast enough to capture an image of every flag that passes the 
inspection point. Since it takes a flag (12.8 in)/(21 in/sec) ≈ 0.6 sec to cross the entire
field of view, we take a frame every 0.3 sec in order to guarantee that every image will
contain a whole flag, and that no flag will be missed. We assume that the camera is
computer controlled to fire from a clock signal. We also make the standard assumption
that it takes 1/30 sec ≈ $330 \times 10^{-4}$ sec to read a captured image into a frame buffer.
Therefore, the total time needed to acquire an image is $(330 + 9.5) \times 10^{-4} \approx 340 \times 10^{-4}$
sec. Subtracting this quantity from the 0.3 sec frame interval leaves us with about 0.27 sec
to do all the processing required for inspection, and to output an appropriate signal to 
some other part of the manufacturing process. 
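The shutter-speed and timing estimates above follow from a short chain of divisions, sketched here for convenience. The names are illustrative, and the 1/30 sec readout time is the assumption stated in the text rather than a property of any particular camera.

```python
VIEW_WIDTH_IN = 12.8         # width of the viewing area, inches
SENSOR_COLUMNS = 640         # horizontal resolution, pixels
MAX_SPEED_IN_PER_S = 21.0    # maximum flag speed, inches/sec
FRAME_INTERVAL_S = 0.3       # one frame every 0.3 sec
READOUT_S = 1.0 / 30.0       # assumed time to read an image into the frame buffer
MAX_BLUR_PX = 1.0            # allowed motion blur, pixels


def timing_budget():
    """Return shutter, crossing, and processing times derived in the text."""
    pixel_pitch_in = VIEW_WIDTH_IN / SENSOR_COLUMNS        # inches per pixel
    speed_px_per_s = MAX_SPEED_IN_PER_S / pixel_pitch_in   # pixels per second
    max_exposure_s = MAX_BLUR_PX / speed_px_per_s          # upper bound on shutter time
    crossing_s = VIEW_WIDTH_IN / MAX_SPEED_IN_PER_S        # time to cross the field of view
    acquisition_s = READOUT_S + max_exposure_s
    processing_s = FRAME_INTERVAL_S - acquisition_s        # time left for inspection
    return {
        "speed_px_per_s": speed_px_per_s,
        "max_exposure_s": max_exposure_s,
        "crossing_s": crossing_s,
        "acquisition_s": acquisition_s,
        "processing_s": processing_s,
    }
```

With the values from the text this gives a crossing time of about 0.61 sec and roughly 0.266 sec of processing time per frame, consistent with the rounded figures of 0.6 sec and 0.27 sec.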

Since a global thresholding function can be incorporated in most digitizers as part of 
the data acquisition process, no additional time is needed to generate a binary image. 
That is, we assume that the digitizer outputs the image in binary form. The next step is 
to isolate the data corresponding to a complete flag. Given the imaging geometry and 
frame rate discussed above, four basic binary image configurations are expected: (1) part
of a flag on the left of the image, followed by a whole flag, followed by another partial
flag; (2) one entire flag touching the left border, followed by a second entire flag, and
then a gap before the right border; (3) the opposite of (2); and (4) two entire flags, with
neither flag touching the boundary of the image. Cases (2), (3), and (4) are not likely to
occur with any significant frequency, but we will check for each of these conditions. As
will be seen below, Cases (2) and (3) can be handled the same as Case (1), but, given the
tight bounds on processing time, the output each time Case (4) occurs will be to reject 
both flags. 

To handle Case (1) we have to identify a whole flag lying between two partial flags. One
of the quickest ways to do this is to run a window as long as the image vertically, but
narrow in the horizontal direction, say, corresponding to 0.35 in. (based on the window size
1/2 of [12.8 − 12.1]), which is approximately (0.35)(640)/12.8 ≈ 17 pixels wide. This
window is used to look for a significant gap between high counts of 1's, and it is narrow
enough to detect Case (4). For Case (1), this approach will produce high counts starting
on the left of the image, then drop to very few counts (corresponding to the background)
for about two inches, pick up again as the center (whole flag) is encountered, stay high
for about five inches, drop again for about two inches as the next gap is encountered,
then pick up again until the right border is encountered. The 1's between the two inner
gaps correspond to a complete flag and are processed further by the methods discussed
below; the other 1's are ignored. (A more elegant and potentially more rugged way is
to determine all connected components first, and then look for vertical gaps, but time 
and cost are fundamental here). Cases (2) and (3) are handled in a similar manner with 
slightly different logic, being careful to isolate the data corresponding to an entire flag 
(i.e., the flag with a gap on each side). Case (4) corresponds to a gap-data-gap-data-gap 
sequence, but, as mentioned above, it is likely that time and cost constraints would dic- 
tate rejecting both flags as a more economical approach than increasing the complexity 
of the system to handle this special case. Note that this approach to extracting 1's is
based on the assumption that the background is not excessively noisy. In other words,
the imaging setup must be such that the background is reliably segmented as black, with
acceptable noise.
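One way to sketch the window scan described above is to count 1's in consecutive vertical strips, label each strip as data or gap, and keep a data run only if it has a gap on both sides. The function names, the non-overlapping strips, and the `min_count` threshold are assumptions made for illustration; the text fixes only the 17-pixel window width. The binary image is taken to be a list of rows of 0's and 1's.

```python
def column_profile(image, window=17):
    """Count the 1's in consecutive vertical strips that are `window` pixels wide."""
    width = len(image[0])
    return [
        sum(sum(row[x:x + window]) for row in image)
        for x in range(0, width, window)
    ]


def runs(flags):
    """Collapse a boolean sequence into (value, start, end) runs, end exclusive."""
    out = []
    for i, flag in enumerate(flags):
        if out and out[-1][0] == flag:
            out[-1] = (flag, out[-1][1], i + 1)
        else:
            out.append((flag, i, i + 1))
    return out


def whole_flag_span(image, window=17, min_count=20):
    """Return (left, right) pixel columns of the single flag bounded by gaps.

    Returns None when no such flag exists or when two are found (Case 4),
    in which case the text suggests rejecting both flags.
    """
    width = len(image[0])
    profile = column_profile(image, window)
    segments = runs([count >= min_count for count in profile])
    # Runs alternate between data and gap, so an interior data run
    # necessarily has a gap run on each side.
    whole = [
        (start * window, min(end * window, width))
        for k, (is_data, start, end) in enumerate(segments)
        if is_data and 0 < k < len(segments) - 1
    ]
    return whole[0] if len(whole) == 1 else None
```

Because the strips are 17 pixels wide, the returned columns are accurate only to within one window width, which is adequate for handing a region to the later star and stripe tests.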

With reference to Fig. 1.23, the preceding discussion has carried us through the seg- 
mentation stage. The approach followed here for description, recognition, and the use 
of knowledge, is twofold. For the stars we use connected component analysis. For the 
stripes we use signature analysis. The system knows the coordinates of two vertical 
lines which contain the whole flag between them. First, we do a connected components 
analysis on the left half of the region (to save time) and filter out all components smaller 
and larger than the expected size of stars, say (to give some flexibility), all components 
less than 9 (3 × 3) pixels and larger than 64 (8 × 8) pixels. The simplest test at this
point is to count the number of remaining connected components (which we assume to
be stars). If the number is 50 we continue with the next test on the stripes. If the number
is less than 50 we reject the flag. Of course, the logic can be made much more complicated
than this. For instance, it could include a regularity analysis in which the relative
locations of the components are analyzed. There are likely to be as many answers here 
as there are students in the class, but the key objective should be to base the analysis on 
a rugged method such as connected component analysis. 
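As an illustration of the star test, the sketch below labels connected components with a breadth-first flood fill, discards those outside the 9 to 64 pixel range given above, and counts what remains. The use of 8-connectivity and the function names are assumptions. The text specifies rejection only when fewer than 50 components remain; treating any count other than 50 as a failure is a further assumption of this sketch.

```python
from collections import deque


def component_sizes(image):
    """Return the pixel count of each 8-connected component of 1's."""
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    sizes = []
    for y0 in range(height):
        for x0 in range(width):
            if image[y0][x0] != 1 or seen[y0][x0]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            sizes.append(size)
    return sizes


def star_test(image, min_size=9, max_size=64, expected=50):
    """Pass when exactly `expected` components fall within the star size range."""
    stars = [s for s in component_sizes(image) if min_size <= s <= max_size]
    # Assumption: a count above `expected` is also treated as a failure.
    return len(stars) == expected
```

The `image` argument would be the left half of the isolated flag region, as suggested in the text, so that only the area containing the blue field is labeled.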

To analyze the stripes, we assume that the flags are printed on white stock material. 
Thus, "dropping a stripe" means creating a white stripe twice as wide as normal. This
is a simple defect detectable by running a vertical scan line in an area guaranteed to
contain stripes, and then looking at the gray-level signature for the number of pulses of
the right height and duration. The fact that the data is binary helps in this regard, but the
scan line should be preprocessed to bridge small gaps due to noise before it is analyzed.
In spite of the ±15° variation in direction, a region, say, 1 in. to the right of the blue
region is insensitive enough to the rotational variation that a scan line run vertically
in that region will show only stripes.
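The signature test on a binary scan line can be sketched with run-length encoding: bridge very short runs, then check the number and width of the white pulses. Everything specific here is an assumption for illustration, including the function names, the `min_run` bridging rule, the width `tolerance`, and the values passed for `expected_pulses` and `nominal_width`, which depend on the flag design and imaging geometry.

```python
def run_lengths(line):
    """Run-length encode a 0/1 sequence as [value, length] pairs."""
    out = []
    for value in line:
        if out and out[-1][0] == value:
            out[-1][1] += 1
        else:
            out.append([value, 1])
    return out


def bridge_short_runs(line, min_run=2):
    """Flip interior runs shorter than min_run so they merge with their neighbours."""
    encoded = run_lengths(line)
    cleaned = []
    for k, (value, length) in enumerate(encoded):
        interior = 0 < k < len(encoded) - 1
        if interior and length < min_run:
            value = 1 - value
        cleaned.extend([value] * length)
    return cleaned


def stripes_ok(line, expected_pulses, nominal_width, tolerance=0.3):
    """Pass when the scan line shows the expected number of white pulses,
    each within `tolerance` of the nominal stripe width in pixels."""
    pulses = [n for v, n in run_lengths(bridge_short_runs(line)) if v == 1]
    low = nominal_width * (1 - tolerance)
    high = nominal_width * (1 + tolerance)
    return len(pulses) == expected_pulses and all(low <= n <= high for n in pulses)
```

A dropped stripe merges neighbouring white regions, so it shows up both as a missing pulse and as a pulse noticeably wider than nominal; either condition fails the test.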

It is important that any answer to this problem show awareness of the limits in available 
computation time. Since no mention is made in the problem statement about available 
processors, it is not possible to establish with absolute certainty if a solution will meet 
the requirements or not. However, the student should be expected to address this issue. 
The guidelines given in the preceding solution are among the fastest ways to solve the 
problem. A solution along these lines, and a mention that multiple systems may be 
required if a single system cannot meet the specifications, is an acceptable solution to 
the problem. 